Talk:Free statistical software/Draft

From Citizendium
This article has a Citable Version.
Definition: This article briefly describes what free software is available for conducting statistical analysis of data.
Checklist and Archives
Workgroup categories: Mathematics and Computers
Talk Archive 1
English language variant: American English

Just starting this article, feel free to jump in and help or give advice. Gene Shackman 04:55, 8 March 2009 (UTC)

add link to my review?

Would anyone be willing to add a couple of sentences about my review of free statistical software? This could be added into the main article section on reviews, perhaps saying something like what I wrote on my webpage: "Basically, for correlation and simple regression, all gave the same results. Some also had the same results for forward stepwise." I only ask because I've searched the web and can't find any other review showing that the packages get the same results and so should be statistically equivalent. This seems like it would be useful information to have. Thanks Gene Shackman 03:45, 14 March 2009 (UTC)

call for comments

Hi Gene, I saw your note on the forum so just popped in to check it out. It's an interesting article and certainly good information. I do not know anything about statistical software so cannot really comment on the topic at hand. Should there maybe be a little more of a historical perspective? At present it reads like a "current" review that might become dated quite rapidly.

With regard to the following paragraph:

"Before using any statistical packages, it is generally a good idea to have a solid background in Statistics. Then the packages can be used to the best advantage, for example, to choose the most appropriate test, to make sure all the necessary assumptions are met, so that the appropriate conclusions can be drawn."

While this makes sense it is too general to be useful for a reader. Shouldn't there be specific examples of why some packages are better than others for any particular test? My first question would be "what makes these programs different?". Also, "why are some better than others for a particular test?"

Chris Day 14:48, 22 March 2009 (UTC)

Thanks for the comments. About "historical" - well, while some of the links may get to be outdated here and there, I think much of what is currently in the article is fairly stable. The packages mentioned are ones that have been around for a number of years. The reviews, articles and websites with listings and reviews are also in sources that are fairly stable, for example, journals, newsletters. Most of the tutorials, email lists and faq pages have also been around for a number of years. Should I add something about that, that most of these things have been around for some time?
About specific examples, and why some packages may be better than others in some procedure... That certainly would be useful and I'm getting to that part; I haven't written it yet.
And this leads to one question that I proposed above. As far as I know, there is only one source comparing the outcomes of the different programs, to show whether they are similarly accurate and what their output looks like, and the source is, well, mine. I wrote several reviews, here, but this is at a site I own, so am I allowed to refer to it? There is the one other paper about the accuracy of easyreg and I do mention that, but mine is the only comparison that I could find, and I've searched. Could I refer to it, or would someone else be kind enough to add something about it to this article?
Thanks for the feedback and I'm certainly open to any other suggestions. Gene Shackman 15:12, 22 March 2009 (UTC)
With regard to historical, I was not thinking about the stability of the links or content but more about the history of these programs. Why were they developed? Who developed them? That sort of thing. Chris Day 15:45, 22 March 2009 (UTC)
I'll look into that, see if I can add something. Gene Shackman 17:14, 22 March 2009 (UTC)
Good start. Now, I may very well be downloading some software, but I know enough statistics to know what I don't know. I suspect the history and examples of applicability will go together — the fact that someone built the software presumably indicates that they saw a need. Especially when that need is perceived in an application area (e.g., epidemiology, agricultural yield), rather than "have statistical freeware for the statistics class", the example becomes more concrete.
I have no answer to the problem of both distrust and overconfidence in statistical methods in the general population. A friend of mine is constantly complaining how political polls can't show anything with a small sample, and he will also complain about biased sensational media polls. I'm not sure if his distrust is more due to not knowing anything about sampling, or about things that are not statistical per se but poll design (e.g., Guttman vs. Likert vs. Mokken scaling, randomization of question order to avoid suggestion, etc.). Howard C. Berkowitz 18:16, 22 March 2009 (UTC)
I think that a discussion of the issue about distrust and overconfidence in statistical methods should really go in the statistics page, or a sampling page, if there is one. On this page, about software, maybe just a mention that software makes it easy to do analysis, but you need to know what you are doing, so see statistics. Gene Shackman 20:38, 22 March 2009 (UTC)

what the devil is an NGO?

This is used all through the article -- Larry doesn't like abbreviations, especially ones that we (read "I") don't understand. Please change it, explain it, or remove it. Thanks! Hayford Peirce 15:28, 22 March 2009 (UTC)

I added the full name in the first paragraph - non governmental organization, like UNESCO. Gene Shackman 17:18, 22 March 2009 (UTC)
Thanks! I've never hoid it used before, but I suppose it's common enough. (Just did a little checking, it really does need a hyphen....) Hayford Peirce 17:32, 22 March 2009 (UTC)

Thanks for changing Non Govern.... to non-govern.... Gene Shackman 20:35, 22 March 2009 (UTC)

Non-governmental organization added. Howard C. Berkowitz 04:40, 2 April 2009 (UTC)

I'm not altogether certain of the relevance to statistics, but we have an entry on the subject and a number of other articles on topics mentioned so I've added some links. Roger Lohmann 15:45, 14 May 2009 (UTC)


Just out of interest, why is this in the Sociology Workgroup? Chris Day 15:42, 22 March 2009 (UTC)

I suppose it's in sociology because I'm a sociologist, and sociology uses a lot of stat analysis. But I'm certainly happy to see other work groups added, psychology, some science ones, anything anyone is interested in. I just can't remember how at the moment. Gene Shackman 17:18, 22 March 2009 (UTC)
All you have to do is go to the Metadata template page and add their names to the Categories list, of which there are three. Anyone can do it, although I suppose that this method can be open to dispute, which it *occasionally* has been (as when Martin Cohen started adding Philosophy to a lot of articles so that he could then be an editor), so it should be done with intelligence and discretion. Hayford Peirce 17:38, 22 March 2009 (UTC)

I'd say it is a bad precedent to add the sociology workgroup because it uses statistics. That is true for tens of our workgroups and we can only add three. One exception would be if this is a subject of sociology, possibly like the freeware movement could be looked at from a sociological perspective. But I don't think this article is going in that direction? Chris Day 18:26, 22 March 2009 (UTC)

Hmm, I'm on the sociology and psychology work groups, but I don't think I'm on others. Um, how would this work then? Do I need to get put on some work group that this article would be appropriate for? Math? Computers? How would I get added to those groups? Do I need to be on a work group to write an article in that group? Gene Shackman 18:53, 22 March 2009 (UTC)

No, you can write about whatever you like. The only thing that you can't do is approve it. Chris Day 18:56, 22 March 2009 (UTC)

Added the computer workgroup. Gene Shackman 19:00, 22 March 2009 (UTC)

Review of outputs and results

As far as I know, there is only one source comparing the outcomes of the different programs, to show whether they are similarly accurate and what their output looks like, and the source is, well, mine. I wrote several reviews, here, but this is at a site I own, so am I allowed to refer to it? There is the one other paper about the accuracy of easyreg and I do mention that, but mine is the only comparison that I could find, and I've searched. Could I refer to it, or would someone else be kind enough to add something about it to this article? Thanks Gene Shackman 01:44, 23 March 2009 (UTC)

As a Computers Editor, I'd observe that the person who wrote software or techniques often can best explain its capabilities, although someone else may be the best at breaking it. Showing your own work really depends on how objectively it's done; to ban that altogether defeats the idea of expertise. I'd say go ahead and put something into the article, keeping self-promotion in mind. There are other cases where people have written about freeware, etc., where they've had a role, but the writeup seemed appropriate. Howard C. Berkowitz 02:18, 23 March 2009 (UTC)

Response to feedback

1. Chris wrote "history of these programs. Why were they developed? Who developed them?" I added some information about this. Like to see more?

2. Hayford did some editing. Thanks!

3. Howard said "I'd say go ahead and put something into the article, keeping self-promotion in mind." I put in some information from my review. Could folks let me know whether the info I put in seems appropriate? Make sure it isn't a self promoting problem? Gene Shackman 04:38, 23 March 2009 (UTC)

If there is more it would be useful. I think a lot of this gets assumed as common knowledge whereas to people like me it is all new. Chris Day 04:42, 23 March 2009 (UTC)
"If there is more" Just wanted to check which "more" you meant. More history of the programs? More of the information from the review on my website? Thanks Gene Shackman 11:33, 24 March 2009 (UTC)

Looking good

I've started investigating, and the first thing I tried had me muttering non-family-friendly things: EpiInfo says it's XP SP3 dependent. Do you know if they are serious? SP3 doesn't run on HP machines. Not a big thing if you don't happen to know, but why on earth would they make something so restrictive? Howard C. Berkowitz 01:17, 3 April 2009 (UTC)

Hey, here's something I know - HP has a patch for SP3! I had to run it on my machines and everything went smooth (though it was a pain)... of course, this has nothing to do with Free statistical software, which I know nothing about :( D. Matt Innis 01:21, 3 April 2009 (UTC)
Fascinating. It utterly trashed my Registry; I finally had to reformat and start from a base SP2, cursing as I reinstalled a few dozen applications. Howard C. Berkowitz 02:38, 3 April 2009 (UTC)
Me, too, the first time - though a system restore worked initially and I turned off auto updates after that, but then I had to upgrade some software and it required SP3, so I tried again and it did the same thing. This time it sent me to the windows update center which gave me the link to HP's fix. D. Matt Innis 02:57, 3 April 2009 (UTC)
Oops, didn't know that. See
Howard wrote "It utterly trashed my Registry". Was that the HP patch for SP3 or EpiInfo? Either way, sorry about that. I also found which was a solution to the problem.
I can add something about possible glitches to software, if I can find places on the web that describe those probs, like the epi info issue, if you think that would help? Gene Shackman 04:18, 3 April 2009 (UTC)
SP3 did the damage; no, it wasn't the Intel/AMD bug. I'm just puzzled why CDC says it cares about SP3 -- I suppose I might think some real-time or networking software might care about the service pack level, while many Windows apps run not just on XP, but a variety of releases. Someone at CDC saying "we run it on SP3 and haven't tried anything else?"
No, unless it's a long-term instability, or perhaps no longer maintained software, I wouldn't bother trying to keep track of glitches; that was just a surprising restriction. Howard C. Berkowitz 04:30, 3 April 2009 (UTC)

signed article

Hi all

I just found this CZ article about signed articles

Can I make this article about free statistical software into a signed article? Signed articles look like a great way to get people to contribute, so they can get 'credit', and get something put on their resumes or CVs. I was thinking of suggesting something like this but it's great that it's already established.

thanks Gene Gene Shackman 16:49, 5 April 2009 (UTC)

As I understand the current policies, it should be nominated, preferably and perhaps necessarily by an Editor, but I'd certainly be willing to consider that under the Computers Workgroup. Howard C. Berkowitz 16:59, 5 April 2009 (UTC)
That would be great! Could the signature also say something like "with edits by Hayford (and others as appropriate), and suggestions by Howard (and others as appropriate)"? Gene Shackman 00:05, 6 April 2009 (UTC)
LOL! I don't think we've gotten that far in the procedures; the assumption had been a single signature. Interestingly, however, you are describing something very much like one of the suggestions made in the Forum for ways in which people could cite their contributions here for CV purposes, etc. Howard C. Berkowitz 00:36, 6 April 2009 (UTC)

what info would be useful

Okay, now I'd like to ask for some feedback again, on what sort of information would be useful to people about these packages? What to add next to this article?

thanks Gene Shackman 00:46, 13 April 2009 (UTC)

Pretty close to nomination

I have what are mostly lower-case "e" editorial observations.

Under "Brief history of free statistical software"

  • "In some cases, the statistical software packages were developed for the purposes of making key technologies available to those who could not otherwise afford them, to empower development[11],[12]" First, what were the technologies? Second, could you name the packages rather than having the information in a footnote?
  • "A couple of packages don't appear to give any statements about why they were developed, other than just general use for statistical analysis[9],[5],[7]." Again, name the packages.
  • "EasyReg is listed or used in these papers[18],[19],[20], various versions of EpiInfo were used in these papers[21],[22],[23], R was used in these papers[24], [25], [26] and WinIdams was used in these papers[27], [28]." Lots of citations; I tend to dislike multiple citations in a row. Can you synthesize the reasoning for versions, etc.? Is it any more than those were the versions available at the time, and only the package name is really important?

Under "Using free statistical software"

  • I'd still like to see an article, however brief, on "statistical software". Some of the points about understanding statistics are equally true for free and commercial.
  • Again personal preference, but I prefer one footnote per point, not several in a row. Perhaps several related citations could be under one footnote, although I prefer the most important.

Howard C. Berkowitz 18:00, 18 May 2009 (UTC)

I did most of the cases of naming the software (instead of "in some cases"). I also got most of the "footnotes in a row" and put bunches together into single footnotes. Also fixed the 'different versions' of epi-info. But I had a question. In this set of references "OpenStat and Instat were developed as teaching aids[8],[3]" one of the citations refers back to an earlier citation. How do I include that in one footnote, like this -
8 - (spelling out the citation)
3 - refer back to a previous citation.
Not sure what to do with this: "I'd still like to see an article, however brief, on 'statistical software'." Does this mean I should rewrite the current article and move parts to a different article, or should I eventually create another complete article, which refers to this one?

Gene Shackman 00:10, 21 May 2009 (UTC)

Approval and the Kops

I just glanced at the History of this article. It looks to me as if, aside from Gene, I'm the only person who ever contributed to it. But I'm pretty certain that all I did was a couple of very minor copyedits such as changing a "which" to "that". I don't think that my contributions are enough to prohibit me from doing the actual Approval. But what do you think? Hayford Peirce 19:07, 24 May 2009 (UTC)

Hayford - do you mean the second sentence, "In general, free statistical software gives results which are the same as the results from commercial programs, and many of the packages are fairly easy to learn, using menu systems, although a few are command-driven"? The "which" in this second sentence should be preceded by a comma, or be a "that." --Ruth Ifcher 02:25, 27 May 2009 (UTC)
Yes, that's the one I meant. Didn't the "that" take? I'll take another look at it. Hayford Peirce 03:17, 27 May 2009 (UTC)
As the nominating Editor, I've been very careful not to edit the article proper. I really don't see much conflict of interest by a Constable if there is an Editor nomination; Caesar's wife was in a much more resource-rich environment. Howard C. Berkowitz 19:42, 24 May 2009 (UTC)
I think the Kops' instruction page about Approvals has strictures about Constables approving pages that they've done more than cursory work on. But I'm pretty sure that there's no problem with this one. Hayford Peirce 19:56, 24 May 2009 (UTC)
Joe Quick left me a message asking whether I would support the nomination. I'm travelling so I don't have enough time to examine the article in sufficient depth. However, I did read through it and I don't see anything wrong, so go ahead. -- Jitse Niesen 07:39, 30 May 2009 (UTC)

"which" to "that"

Thanks, Ruth. I *know* that I did this before, since this is one of my pet hobbyhorses. I must have forgotten to Save my edit, however. Or maybe that weird bug that crops up occasionally wiped it out. But now it's done! Hayford Peirce 03:22, 27 May 2009 (UTC)

Be sure to also mark the article as being written in American English. (Isn't "American English" an oxymoron?)
Using "that" in this context is very much incorrect in English English, and is one of the things which trips me up most when reading American writings.
Caesar Schinas 06:26, 27 May 2009 (UTC)
I have, and use, the Second Edition of Fowler's Modern English Usage, and he would disagree with you. I realize that like all rules, it's an arbitrary one decided by human beings at some point, and then imposed, but its guidelines are quite clear. British writers, however, even the best of them, such as Evelyn Waugh, simply choose to ignore them. Unfortunately, as a Constable, I haven't been able to find any CZ guidelines, or rules, that allow me to ban offenders who use "which" in place of "that". Hayford Peirce 14:11, 30 May 2009 (UTC)
I do not, unfortunately, have a copy of Fowler's Modern English Usage, but I looked it up on Google Books and found several pages about "which". Unfortunately, at least one of them was hidden, but in the visible pages I found several usages of "which" where I believe Americans would use "that" in its place. I found no recommendation to use "that" instead of "which", but as I say, one page was hidden.
Regardless, I do feel that one man's advice 85 years ago can hardly be taken to have more relevance than the usage employed by the writers of the entire nation. I know of no educated Brit who would consider "that" to be correct in place of "which", though it's one of those Americanisms which are gradually making their way into spoken English.
Personally, I can live with either word when reading others' work, but I'll never use "that" myself...
Caesar Schinas 15:00, 30 May 2009 (UTC)

APPROVED Version 1.0


This article isn't praiseworthy for all the usual reasons we come to praise Caesar, but it adds something new, which I haven't completely articulated, rather to my frustration. Let me try: it is neutral, yet presents expert knowledge of something that really isn't documented anywhere. Synthesis? Something else? Whatever it is, we want to do more. Howard C. Berkowitz 16:44, 30 May 2009 (UTC)

Something wrong with the approval process -- HELP NEEDED!!!!

For some reason I can't make the Workgroups show up -- they're both in the metadata template but then they don't appear on the *real* pages. What the @#$%^&* is going on?! Hayford Peirce 17:38, 30 May 2009 (UTC)

Well, Howard's put in a template, but that doesn't seem to be helping. Hayford Peirce 18:17, 30 May 2009 (UTC)
Hmmm, now the template's gone. Hayford Peirce 18:18, 30 May 2009 (UTC)
Adding the {{Approval template here is where you went wrong. It goes on the Approval page only. D. Matt Innis 03:19, 31 May 2009 (UTC)
Okie, thanks! I'll study this *carefully* tomorrow morning when I ain't drinkin' a Manhattan and will try to figure out what I did wrong and how I can rewrite the instructions again. I *still* don't understand, however, why sometimes the {{Approval and }} show up on blank screens when I need them and sometimes they don't. And why sometimes {{subpages}} shows up when needed and sometimes it doesn't. Hayford Peirce 04:09, 31 May 2009 (UTC)

Now things seem to be OK!

What did Howard do, and why? Those are the questions that must be answered! Hayford Peirce 18:20, 30 May 2009 (UTC)

Have unprotected the talk page -- the instructions are *very* unclear on this concept

The instructions are gonna have to be rewritten. Hayford Peirce 18:22, 30 May 2009 (UTC)

Approved article

Thanks for approving this article. I'd like to ask a couple of questions.

1. Howard wrote "Whatever it is, we want to do more."

Um, I hope that means you like the article. Would you like me to do some more articles in this style, neutral but expert knowledge?

2. I guess this article is approved. Now can I make changes, like if I find new information or new software? Would that go on the draft page?

thanks gene

Gene Shackman 04:53, 31 May 2009 (UTC)

Gene, once an article is approved, then it is "protected" (i.e., locked) so that edits cannot be made to the main article page of the approved version. Any further edits/revisions/additions can only be made to the draft article. When the draft has had significant edits/revisions/additions, then it may be nominated for "re-approval" by the same methods used for "approvals". Milton Beychok 06:49, 31 May 2009 (UTC)
Okay, thanks for the info. Gene Shackman 13:52, 31 May 2009 (UTC)
One further amendment to that, as was discussed somewhere else a week or so ago: if you find a *small* thing to change in the article, such as a typo, or a mis-used word, or a bibliographical reference in the wrong place, just ask your friendly neighborhood Kop to change it. You can contact the Constables directly, or me, or put something on the Talk page of the article in question, and it will be taken care of. Hayford Peirce 16:26, 31 May 2009 (UTC)

(undent, weird edit conflict deleting text)

As far as #1, yes, I like it. Some time ago, Russell Jones commented that the way I had brought together seemingly dissimilar issues in Wars of Vietnam constituted what he saw as good about CZ, which might be considered permissible "original synthesis". In this case, that particular structure isn't my invention, but the idea holds: there is an opportunity for neutral, expert presentation of things that have non-obvious relationships, such as software packages that not only have a similar function but are also free. I'd love to have some guidance on freeware graphics packages, especially those that can be used for computer-assisted design, as opposed, say, to photographic manipulation. Is there a package, I wonder, that uses an interface that would make sense to a skilled darkroom technician, rather than to someone more of a graphics art printer? Howard C. Berkowitz 19:47, 31 May 2009 (UTC)


Is the "each other" correct? "Most of the major commercial packages do many of the same statistical procedures as each other and as many of the free statistical packages." --Paul Wormer 16:42, 31 May 2009 (UTC)

I'm not sure, but at any rate it's pretty clumsy... What about something like this :
Most of the major commercial and free packages have many statistical procedures in common.
Caesar Schinas 16:48, 31 May 2009 (UTC)
Caesar got in before me. I had written: It's certainly very awkward, and the "as many of the free etc." is meaningless. Or incomprehensible. This entire sentence should be rewritten for flow, clarity, and pleasure to the eye. Hayford Peirce 16:50, 31 May 2009 (UTC)
Yes, that would be okay, as long as that's what he's actually trying to say. Hayford Peirce 16:51, 31 May 2009 (UTC)
Well that's how I interpret the original sentence - actually, I'm not sure how else it could be interpreted! Caesar Schinas 17:18, 31 May 2009 (UTC)
"Most of the major commercial and free packages have many statistical procedures in common." That sounds good to me. That is what I meant. Gene Shackman 17:24, 31 May 2009 (UTC)
Okay, does everyone agree that I, a Kop, should make this change, and that it is minor enough that I am allowed to do so? Paul? Other Editors? Hayford Peirce 17:35, 31 May 2009 (UTC)
Go ahead, --Paul Wormer 10:17, 1 June 2009 (UTC)

correcting a footnote

Would someone be able to save the 'draft' version to the final version? I had one citation to the wrong source, and have corrected it.

In the "Reviews of free statistical software" section, I updated citation #29.


Gene Gene Shackman 05:12, 22 June 2009 (UTC)

Fine with me. Howard C. Berkowitz 20:23, 14 December 2009 (UTC), Computers Workgroup Editor
Thanks Howard, I'll take care of that. D. Matt Innis 20:49, 14 December 2009 (UTC)
Got it. D. Matt Innis 20:51, 14 December 2009 (UTC)