Steven,
I did not say that everyone in the computer science trade “thought that ‘data science’ is the term for the computer science associated with information
management”, but rather “everyone in the (data science) field”. The field produced a mathematical theory for manipulating information, analytical methods for converting a problem
space to a formal model, and completeness theories for data organizations and search strategies. The guardians of the ‘computer science’ term just failed to realize that those guys who were building the electronic business equivalent of cathedrals were also
developing the equivalent of a formal theory for mechanics. So it had a different name.
And OBTW, the reporter must say that something is ‘new’; otherwise it isn’t “news”, and he doesn’t have a story.
You wrote:
> Just as "computer science" is really "computer engineering," I suspect that data science is no kind of science but is really data engineering.
I strongly disagree. I will grant you that people misuse the term ‘computer science’ to cover a lot of engineering activities. And I will grant you that the
sense of the difference that I have is hard to describe. In making standards in this trade, however, I have encountered enough senior software engineers who were not computer scientists to recognize a difference.
The difference is in being educated in the theory, and understanding the theory, in addition to, or in lieu of, being knowledgeable about the practice. The
senior software engineers know enough about the theory to recognize that they are applying it, or at least to call what they are doing an application of the theory. But particularly in the data sciences, in computational linguistics, and in a number of automated
reasoning applications, the difference between the expert engineer and the computer scientist can be quite apparent. The telltale line is “I can make this work”, as opposed to “I can prove that it will work”.
A really simple example: In 1991, in formulating the OMG interface language IDL, there was a standardization battle over whether a function could take/return
an operand whose type was “list of <type>” vs. “array of <type>”. All the engineers wanted “array of type”, because arrays have a fixed size or an upper bound, and they knew how to implement that. The two computer scientists, from DEC and Xerox, wanted “list
of type”, because the idea was to define a functional model that would be mapped to various engineering implementations. The computer scientists understood what a “functional model” is; the engineers saw “method signature”. This is a very low level issue,
and yet, the difference in the kind of thinking involved is apparent.
Like mathematics, real “computer science” involves a mental discipline that is tied up with abstraction and inference, and somehow with the scientific notion of a “variable”
– the things you cannot codify. Engineering is about devising an application of technology to implement a function. The interaction point comes in engineering analysis – how you know in what range of situations a given design will work. Engineers need to
know enough about the theory to understand and use the analytical methods. But it often takes specialized knowledge to make the analytical models, and that is where the understanding of the variables and the understanding of the theory come in. Stellar
engineers can do both.
Another example: In solving some optimization problem, we ended up with a cost function that produces a vector and a partial ordering for the vectors that
is used to prune the search trees. To some of the analysts involved, the idea that the cost function would not just produce a real number meant that the fundamental algorithm we were applying could not possibly be applied. They understood the search algorithm
well enough to apply it, but not well enough to generalize it.
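To make the generalization concrete, here is a minimal sketch of a tree search pruned by a partial order on vector costs. It is not the system described above, whose details are not given here. It assumes that lower is better in every component, that step costs are non-negative and additive, and that the partial order is componentwise (Pareto) dominance. The names `dominates`, `pareto_search`, and `stages` are made up for the illustration.

```python
from typing import List, Tuple

Cost = Tuple[float, ...]


def dominates(a: Cost, b: Cost) -> bool:
    """a dominates b: no worse in every component, strictly better in one."""
    return all(x <= y for x, y in zip(a, b)) and any(x < y for x, y in zip(a, b))


def pareto_search(stages: List[List[Cost]]) -> List[Tuple[Cost, Tuple[int, ...]]]:
    """Pick one option per stage; return all non-dominated complete choices.

    Assumption: step costs are non-negative, so a partial cost is a lower
    bound (componentwise) on the cost of every completion below that node.
    """
    front: List[Tuple[Cost, Tuple[int, ...]]] = []  # non-dominated solutions so far

    def visit(depth: int, cost: Cost, choices: Tuple[int, ...]) -> None:
        # Prune: if a known solution dominates this lower bound, it also
        # dominates every completion of this node.
        if any(dominates(found, cost) for found, _ in front):
            return
        if depth == len(stages):
            # Drop solutions the new one dominates, then record it.
            front[:] = [(f, c) for f, c in front if not dominates(cost, f)]
            front.append((cost, choices))
            return
        for i, step in enumerate(stages[depth]):
            total = tuple(x + y for x, y in zip(cost, step))
            visit(depth + 1, total, choices + (i,))

    zero = (0.0,) * len(stages[0][0])
    visit(0, zero, ())
    return front


# Two stages, two options each, two-component costs (hypothetical data).
example = [[(1, 4), (3, 1)], [(2, 2), (5, 5)]]
result = pareto_search(example)
```

The only change from the scalar case is that "the best cost so far" becomes a set of mutually incomparable costs, and "worse than the best" becomes "dominated by some member of that set". The search skeleton itself is untouched.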
I think the distinction I make may be more of the “I know the difference when I see it” kind. It is just that I see it frequently.
-Ed
From: ontolog-forum-bounces@xxxxxxxxxxxxxxxx [mailto:ontolog-forum-bounces@xxxxxxxxxxxxxxxx]
On Behalf Of Steven Ericsson-Zenith
Sent: Friday, August 29, 2014 9:36 PM
To: [ontolog-forum]
Subject: Re: [ontolog-forum] Looking to the Future of Data Science - NYTimes.com - 2014.08.27
On Fri, Aug 29, 2014 at 10:23 AM, John F Sowa <sowa@xxxxxxxxxxx> wrote:
NY Times
> The Association for Computing Machinery, a leading professional
> association in computer science, is this week holding its annual
> conference focused on what we're now calling data science - though
> the ACM still clings to the label adopted when the yearly gatherings
> began in 1998, Knowledge Discovery and Data Mining.
JFS
> That's the fundamental principle for creating the illusion of progress:
> Change the name of the field every decade.
EJB
> I don't think that is fair. Anyone I know or knew in the field has long
> thought that "data science" is the term for the computer science associated
> with information management.
If everyone knew that, why did they need another name? For that matter,
note that the NYT reporter thought that 'data science' was a new name.
Just as "computer science" is really "computer engineering," I suspect that data science is no kind of science but is really data engineering.