Sunday, July 21, 2019

Topological Unsort

To implement program with multiple languages, the languages must communicate. C is an interface language; most other languages have a C interface. We organize data in C programs with struct and union. To communicate between C and other languages data is converted to and from struct/union. Threads and processes send messages, each of which is a different struct/union. Most languages are more powerful than C, so they can restrict portions of their data organization to conform to C struct/union. When there are many different struct/union types to send, it can become arduous to change the corresponding data types in each language. Preferably, a script would automatically generate the struct/union types in each language. Each field of a struct/union can be a pointer to one or more other contiguous struct/union instances, an array, a basic type, an enumeration type, or a string. Each field can have validity dependent on other fields, called tags. Think of the tags as dimensions; tag values are a tag space. Each field is valid in a subset of the tag space. C is limited in that without an unwieldy number of identifiers, struct/union can not have a unique representation for every point in the tag space. Instead, consecutive fields should have similar tag space subsets. My script to generate struct/union types in several languages, takes as input lists of fields, each with a tag space subset. Consecutive fields or unions with equal tag space subsets collect into structs. Consecutive fields or structs with disjoint tag space subsets collect into unions. A tag space subset opens a union. A tag space superset closes structs and unions. A topological sort of the graph is the original list of fields.

No comments: