Title :
Conservation of Information: Software’s Hidden Clockwork?
Author_Institution :
Fac. of Sci., Eng. & Comput., Kingston Univ., Kingston upon Thames, UK
Abstract :
This paper proposes that the Conservation of Hartley-Shannon Information (hereafter H-S Information) plays the same role in discrete systems as the Conservation of Energy does in physical systems. In particular, a variational approach shows that the symmetry of scale-invariance, power-laws and the Conservation of H-S Information are intimately related. Together they lead to the prediction that, in any software system assembled from components made of discrete tokens, the component sizes always asymptote to a scale-free power-law distribution in the unique alphabet of tokens used to construct each component. This prediction is then validated to a very high degree of significance on some 100 million lines of software in seven different programming languages, independently of how the software was produced, what it does, who produced it or what stage of maturity it has reached. A further implication of the theory is that the average size of components depends only on their unique alphabet, independently of the package they appear in. This too is demonstrated on the main data set and on 24 additional Fortran 90 packages.
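As a rough illustration of the quantity the abstract refers to, the sketch below counts the unique alphabet of tokens in each component, builds a histogram of components per alphabet size, and estimates a power-law exponent from a log-log least-squares fit. It is not the paper's variational derivation or its statistical procedure. The function names, the `a_min` cutoff, and the assumption that each component is already tokenized are hypothetical choices made for this example only.

```python
import math
from collections import Counter


def unique_alphabet_size(tokens):
    """Number of distinct tokens in one component (its unique alphabet)."""
    return len(set(tokens))


def alphabet_histogram(components):
    """Map each unique-alphabet size to the number of components having it."""
    return Counter(unique_alphabet_size(tokens) for tokens in components)


def fit_power_law_slope(histogram, a_min=1):
    """Least-squares slope of log(frequency) against log(alphabet size).

    Assumption for this sketch: only sizes >= a_min are fitted, since a
    power law is expected to hold asymptotically, not for small sizes.
    """
    points = [
        (math.log(size), math.log(count))
        for size, count in histogram.items()
        if size >= a_min and count > 0
    ]
    if len(points) < 2:
        raise ValueError("need at least two distinct alphabet sizes to fit")
    mean_x = sum(x for x, _ in points) / len(points)
    mean_y = sum(y for _, y in points) / len(points)
    covariance = sum((x - mean_x) * (y - mean_y) for x, y in points)
    variance = sum((x - mean_x) ** 2 for x, _ in points)
    return covariance / variance


if __name__ == "__main__":
    # Toy components: each is a list of tokens from a hypothetical tokenizer.
    components = [["a", "=", "a", "+", "b"], ["x", "=", "1"], ["y", "=", "2"]]
    print(alphabet_histogram(components))
```

A straight-line fit on log-log axes is the simplest way to read off an exponent, though it is known to be a crude estimator; a maximum-likelihood fit would be a reasonable substitute when working with real data.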
Keywords :
information theory; programming languages; software engineering; Fortran; H-S information conservation; Hartley-Shannon information conservation; discrete systems; discrete tokens; energy conservation; physical systems; scale-free power-law distribution; scale-invariance symmetry; software hidden clockwork; software system; variational approach; Bioinformatics; Computer languages; Genetic communication; Genomics; Software systems; Information conservation; component size distribution; power-law; software systems;
Journal_Title :
Software Engineering, IEEE Transactions on
DOI :
10.1109/TSE.2014.2316158