Talk:Entropy of a probability distribution

From Citizendium
Revision as of 21:56, 4 July 2007 by imported>Robert Tito (→‎question)


Article Checklist for "Entropy of a probability distribution"
Workgroup category or categories: Mathematics Workgroup, Computers Workgroup [Editors asked to check categories]
Article status: Stub (no more than a few sentences)
Underlinked article? Yes
Basic cleanup done? Yes
Checklist last edited by: Greg Woodhouse 11:20, 27 June 2007 (CDT), Ragnar Schroder 10:58, 27 June 2007 (CDT)

To learn how to fill out this checklist, please see CZ:The Article Checklist.





Fixed by me! --Robert W King 11:04, 27 June 2007 (CDT)

question

H is the scientific symbol for enthalpy, S the symbol for entropy. I think this H should in reality be an S, as that is how it appears in my books on probability distributions. Unless it changed within the last year, I think it is still S. I could even understand using A, F or G, but not H. Also, according to the universal definition of entropy, it is unitless. Robert Tito |  Talk  19:58, 4 July 2007 (CDT)

Interesting. I hadn't thought of that. Traditionally, H is used for the entropy function in coding theory. I'm not sure about the history behind that, but now that I think back to my physics classes, you are right about S being entropy. Greg Woodhouse 20:38, 4 July 2007 (CDT)
To put it further: A is the free energy of a system, F the free enthalpy, and G the Gibbs energy (but that is used solely in thermodynamics). What struck me as odd is the unit: the bit is no unit I know of, only a quantity of information, and for that reason energy/entropy. When I teach statistics I use S for entropy, but then I use it in statistical chemistry/physics, and I prefer consistency in units and symbols. Robert Tito |  Talk  21:00, 4 July 2007 (CDT)
Well, entropy is used in coding theory as a measure of information content. A channel is defined as a potentially infinite sequence of symbols from a fixed alphabet. We assume there is a probability distribution telling us how likely each symbol is to occur. The entropy is then (unsurprisingly):

\[ H = -\sum_i p_i \log_2 p_i = \sum_i p_i \log_2 \frac{1}{p_i} \]

where the logarithm is taken to base 2. (If the logarithms are taken to base e, the units of entropy are sometimes called "nats".) Intuitively, this is just a measure of how much the bit stream can be compressed due to redundancy. The logarithm to base 2 of 1/p_i is the number of bits needed to encode a symbol that occurs with probability p_i. People care about this, among other things, because it gives you a framework for calculating the data rate you can hope to get from a channel with a given bandwidth, assuming you use the best possible encoding algorithm (this is the famous Shannon limit). I completely sympathize with your dislike of the inconsistency here. Greg Woodhouse 21:34, 4 July 2007 (CDT)
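To make the bits-versus-nats point concrete, here is a minimal Python sketch of the entropy of a discrete distribution as described above. The function name `shannon_entropy` and its `base` parameter are illustrative choices, not anything from the article; it assumes the input is a finite list of symbol probabilities summing to 1.

```python
import math


def shannon_entropy(probs, base=2.0):
    """Entropy of a discrete distribution.

    base=2 gives the result in bits, base=math.e gives it in nats.
    `probs` is assumed to be a finite sequence of probabilities summing to 1.
    """
    if any(p < 0 for p in probs):
        raise ValueError("probabilities must be non-negative")
    if not math.isclose(sum(probs), 1.0, abs_tol=1e-9):
        raise ValueError("probabilities must sum to 1")
    # Each symbol contributes p * log(1/p); symbols with p == 0 are skipped,
    # following the usual convention that 0 * log(1/0) counts as 0.
    return sum(p * math.log(1.0 / p, base) for p in probs if p > 0)


if __name__ == "__main__":
    fair_coin = [0.5, 0.5]
    biased_coin = [0.9, 0.1]
    print("fair coin, bits:  ", shannon_entropy(fair_coin))
    print("fair coin, nats:  ", shannon_entropy(fair_coin, base=math.e))
    print("biased coin, bits:", shannon_entropy(biased_coin))
```

A fair coin comes out at 1 bit per symbol, while the biased coin comes out below 1 bit, which is the redundancy a good encoder can squeeze out.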
Shannon, in his 1946 and 1948 papers, defined the perfect enigma as one that leaves zero Shannon information in the encoded result. Here the analogy with entropy breaks down, as entropy always tends to a maximum, reaching the lowest-energy state (sometimes called the equilibrium state). Robert Tito |  Talk 

that is the entropy S :)

the definitions are identical, hence I wonder if H is the actual symbol. Robert Tito |  Talk
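For comparison, here is a sketch of how the two definitions line up. It assumes S means the Gibbs entropy of statistical mechanics over the same probabilities p_i, with Boltzmann constant k_B. Neither that formula nor k_B is written out anywhere in this thread, so treat them as assumptions.

```latex
\begin{align*}
  S &= -k_B \sum_i p_i \ln p_i
      && \text{(Gibbs entropy, assumed form)} \\
  H &= -\sum_i p_i \log_2 p_i
      && \text{(coding-theory entropy, in bits)} \\
  S &= (k_B \ln 2)\, H
      && \text{since } \ln p_i = (\ln 2)\,\log_2 p_i
\end{align*}
```

Under these assumptions the two quantities differ only by the constant factor k_B ln 2, that is, by the choice of logarithm base and of units.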