Thursday, July 16, 2009

Inheritance in human and software languages

The English language and software languages are dynamic: they are ever changing. That is why they are worth studying if one wishes to keep using them.

I enjoyed English grammar and word morphology when I was in elementary school. (I enjoyed technology too.) One of the most mind-boggling things about word morphology is how certain words, even jargon in the computer science world, are borrowed from other languages.

A set of words that captures my interest is the names of the number bases, namely 'binary', 'ternary', 'quaternary', 'senary', 'octal', 'decimal', 'vigesimal', 'quadrovigensimal', 'hexatridecimal', 'sexagesimal', and 'hexadecimal'. These words derive from Latin, Greek, or both, with Latin being the primary source. 'Binary' comes from Latin 'bini', which means 'two apiece; two-fold'. 'Ternary' comes from 'terni', which means 'three apiece'. 'Quaternary' comes from 'quaterni', which means 'four apiece'. 'Senary' comes from Latin 'seni', which means 'six apiece'. 'Octal' comes from the numeral for 8, Latin 'octo' and Greek 'okto', and maybe also from the Latin ordinal for 8th, 'octavus'. 'Decimal' comes from the Latin ordinal for 10th: 'decimus, -a, -um'. 'Hexadecimal' combines the Latin-derived 'decimal' with the Greek numeral for 6, 'hex'. Apparently somebody forgot the Latin ordinals or did not like the sound of 'sextadecimal'.

Also, base 24 'quadrovigensimal' strangely derives from the Latin for 'square' and for 'twentieth'. And base 36 'hexatridecimal' strangely does not even use the Latin for 'thirtieth', which would give 'trigensimal', but derives from the Greek for 'six' and for 'thrice' and from the Latin for 'tenth'. Hence, 'hexatridecimal' means 'six and thrice the tenth [base]'.

English words have so many derivations that spelling them sometimes becomes difficult. If someone asked me what the number bases should be named, I would favor the Latin ordinals first and the Greek ordinals second. Below is a chart of what the bases would be called if the names were drawn from Greek or Latin:

Base  Name from Greek Ordinal  Name from Latin Ordinal  Name from Greek Numeric
2     deuteral                 secundal                 dual
3     trital                   tertial                  treisal
4     tetartal                 quartal                  tesharal
5     pemptal                  quintal                  pental
6     hectal                   sextal                   hexal
7     hebdomal                 septimal                 heptal
8     ogdoal                   octaval                  octal
9     enatal                   nonal                    enneal
10    decatal                  decimal                  decal
11    hendecatal               undecimal                hendecal
12    dodecatal                duodecimal               dodecal
13    decata-trital            tertiadecimal            decatresal
14    decata-tetartal          quartadecimal            decatesharal
15    decata-pemptal           quintadecimal            decapental
16    decata-hectal            sextadecimal             deca-hexal
17    decata-hebdomal          septimadecimal           deca-heptal
18    decata-ogdol             duodevicensimal          deca-octoal
19    decata-enatal            undevicensimal           deca-enneal
20    eikostal                 vicensimal*              eikosial
21    prota-eikostal           primavicensimal          hen-eikosial
22    deutera-eikostal         secundavicensimal        duo-eikosial
...
30    triakostal               tricensimal              triakontal
...
36    hectatriakostal          sextatricensimal         hexatriakontal
...
60    hexekostal               sexagensimal             hexekontal
...
64    tetartahexekostal        quartasexagensimal       teshara-hexekontal

* Another way of writing base twenty is 'vigesimal'. I just chose 'vicensimal' because that is the most common way of writing the ordinal of twenty in Latin.


If one asked why English is so irregular in its derivations, linguists would probably say, "People got creative with their language." According to the Wikipedia article on hexadecimal, Donald Knuth, a computer scientist famed for his writings, once said that the etymologically correct word for base 16 is 'senidenary', a combined form of 'seni' ('six apiece') and 'deni' ('ten apiece'). Language develops with its people; that is apparent here. Likewise, software languages develop with people too.


Unlike the English language, C++ inherits from one language rather than several. The following two pieces of example code, each a program that prints its command-line arguments, show how far C++ has come:


#include <stdlib.h>
#include <stdio.h>
#include <string.h>

/**
* stringtest.cpp
* Author: Conrad K
*/

int main(int argc, char * argv[]) {
     /* Total length of all arguments, not counting separators. */
     size_t strlength = 0;
     for (int i = 1; i < argc; i++)
         strlength += strlen(argv[i]);
     /* The extra argc bytes cover the separating spaces and the terminator. */
     char * test_string = new char[argc + strlength];
     test_string[0] = '\0';  /* strcat needs a terminated string to append to */
     for (int i = 1; i < argc; i++) {
        strcat(test_string, argv[i]);
        if (i < argc - 1)
           strcat(test_string, " ");
     }
     size_t test_length = strlen(test_string);
     size_t sample_length = strlen("this is a string");
     printf("String acquired: %s \n", test_string);
     printf("Length of string: %lu \n", (unsigned long) test_length);

     if (test_length > sample_length) {
        printf("The string that you entered is longer than the string \n");
        printf("'this is a string'.\n");
     }
     else {
        /* Covers both the shorter and the equal-length case. */
        printf("The string that you entered is not longer than the string \n");
        printf("'this is a string'.\n");
     }
     delete[] test_string;
     return 0;
}

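The above sample is a C++ program that uses the standard C library. It still relies on a few C++ features (such as 'new'), so it is not quite valid C as written. With those replaced (for example, 'malloc' in place of 'new'), I could probably rename it to a file with a '.c' extension and compile it with a C compiler.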


#include <string>
#include <iostream> 

using namespace std;

/**
* stringtester.cpp
* Author Conrad K
*/

int main(int argc, char * argv[]) {
     string tester("");
     for (int i = 1; i < argc; i++) {
         tester.append(argv[i]);
         if (i < argc - 1)
            tester.append(" ");
     }
     string sample("this is a string");
     string::size_type test_length = tester.length();
     string::size_type sample_length = sample.length();
     cout << "String acquired: " << tester << endl;
     cout << "Length of string: " << test_length << endl;

     if (test_length > sample_length) {
        cout << "The string that you entered is longer than the string ";
        cout << endl << "'this is a string'." << endl;
     }
     else {
        // Covers both the shorter and the equal-length case.
        cout << "The string that you entered is not longer than the string ";
        cout << endl << "'this is a string'." << endl;
     }
     return 0;
} 

The code above uses object-oriented C++, unlike the former code, in that it uses the 'string' object. (A 'string' is a sequence of characters.) With the standard C++ library one inherits a rich set of functions that manage memory safely. By contrast, some C functions are less safe: 'strtok', for example, keeps its position in hidden static state, so it is not safe to call from several threads at once. Both programs print the same line endings, though: 'endl' simply writes a newline (\n) and flushes the stream, and on Windows a text-mode stream translates each newline into a carriage return plus line feed (\r\n) for 'printf' and 'cout' alike. However, the former code compiles into a smaller executable than the latter code (about two kilobytes smaller).

The C++ libraries have evolved in interesting ways. I remember using C++ before the C++ standard library existed. About a decade ago, the C++ library headers existed without a namespace. Now the library lives in a namespace ('std'). Yet C++ still derives from C completely. It does not really derive from Perl, Lisp, Smalltalk, or Prolog.

Other languages inherit from C and C++ in various ways. Java uses most of the atomic types that are found in C and bases its syntax on the C language. However, it does not allow one to allocate memory directly, because it manages memory for the programmer. C# inherits from C++ in even more ways, because it allows direct memory manipulation if one marks the code 'unsafe'. This is one thing that is strange about C#.

I also notice that a built-in type 'string' oddly exists in the C# language. C++ and Java don't have atomic types for 'string', and I don't think of a 'string' as an atomic type. A 'string' does not have a fixed length. This whole blog can be defined as a 'string'. Therefore, a string lives in the memory of a computer and not in a register. Even if we were to define a string as a sequence of alphanumeric characters, strings can have many different lengths. (I heard the longest word in English is about 100 characters long.) Even though I don't think of a 'string' as an atomic type, Microsoft and many .NET developers do. Microsoft knows that people use strings in programming very often, so it added a built-in 'string' type to the C# programming language. One can check the length of a 'string' using the property 'Length' as well as get the text.

Future languages will probably be much different from today's languages. I wonder if the atomic type 'blob' might appear in the languages of the near future, because people work with images a lot. I am also expecting video processing methods and functions to become part of the standard API of programming languages. Likewise, I am expecting that the colloquialisms and slang of today's English will become standard English in the near future. Likewise, what will become of our numbering system if people inherit a dozen fingers from us? A kid inherited polydactyly from his father, according to the KTVU website. That kid is even able to wiggle a dozen toes and work with a dozen fingers! Wow! If the future descendants of today's people acquired genetic mutations from their parents and inherited a dozen fingers and toes, would it be a shame to confuse them with the decimal numbering system? We inherited the numbering system from the Arabs. So, if our descendants will count with more than ten digits, why not inherit new digits from the Arabs for every digit that our descendants inherit from us?
