95% of systems problems arise from data not being in its authoritative source.

Duplicating data is what causes you to have non-authoritative sources of data. Whenever you duplicate data (and code is data), you create a scenario where people can be looking at the non-authoritative source of data.

Some duplication of data is inherently wrong. In a badly designed system, you might cache data that doesn't need to be cached, or you might persist calculated fields without persisting the data that the fields were calculated from.

Sometimes, though, data duplication occurs due to legitimate technical constraints. For example, the authoritative source for data might be in Seattle, but if your computer is in Boston, you might want to download the data to Boston.

As networks get better, though, duplicating data makes less sense. The cost of network lag is far outweighed by the benefit of knowing that your always getting data from its authoritative source.

I think the genius of WebizingPython is that it removes any legitimate language-driven need to duplicate data. If your language can express the concept of getting data from across the continent just as easily as getting the data from the disk drive in the next room, then you can really focus on problemsolving, and then network "gets" just become a performance optimization.

AuthoritativeSourceOfData (last edited 2008-03-04 08:33:24 by localhost)