When a Large Language Model (LLM) analyzes 130,000 documents, it is taking in that information and then using it. Whether you call that learning or absorbing, it took existing information; it did not invent new information.
That is inaccurate. Machine learning does produce information not explicitly included in the learning set.
You mentioned "new original insight in the data" -- that is a matter of the viewpoint and consciousness of the observer.
No, it isn't. The trained model contains information generated by analyzing the learning set, and that fits the definition of "new insight".
Who is right?
The disagreement between Louis and Eli revolves around the interpretation of what constitutes "new information" or "insight" generated by a machine learning model.
Louis's Position: Louis argues that the model does not produce new information. Instead, it processes and analyzes existing data. According to Louis, what might be perceived as new insights are simply reorganizations or interpretations of the information already present in the input data, depending on the observer's perspective.
Eli's Position: Eli contends that machine learning models can generate insights that are not explicitly present in the learning set. This includes patterns, correlations, or predictions that emerge from the model's analysis, which can be considered new information that was not directly provided in the training data.
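As a concrete illustration of both claims, here is a minimal sketch (using NumPy; the dataset and the numbers are hypothetical) in which a least-squares fit recovers a slope and an intercept that are stated nowhere in the training points, then predicts a value for an input the model never saw:

```python
import numpy as np

# Hypothetical training set: five (x, y) observations.
# Neither the slope nor the intercept appears anywhere in it.
x = np.array([1.0, 2.0, 3.0, 4.0, 5.0])
y = np.array([2.1, 3.9, 6.2, 7.8, 10.1])

# Least-squares fit: np.polyfit returns [slope, intercept] for degree 1.
slope, intercept = np.polyfit(x, y, 1)
print(f"learned slope={slope:.2f}, intercept={intercept:.2f}")

# The fitted parameters summarize a relationship that was only implicit
# in the data -- Eli's "new insight". The model can also predict for an
# input absent from the training set:
x_new = 7.0
print(f"prediction for x={x_new}: {slope * x_new + intercept:.2f}")

# Louis's rejoinder still holds in the strict sense: slope and intercept
# are deterministic functions of (x, y); no information entered the
# system except through the training data.
```

Whether the fitted slope counts as "new" information is exactly the definitional question the two positions turn on.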
Resolution: Both positions have valid points:
From Louis's Perspective: Every parameter and output of the model is ultimately derived from the input data; strictly speaking, the model cannot produce anything that has no basis in what it was given.
From Eli's Perspective: The patterns, correlations, and predictions the model extracts were never explicitly stated anywhere in the learning set, so they constitute information the data alone did not spell out.
In essence, whether machine learning produces "new information" can depend on how one defines newness and insight. From a strict data-centric view, all information is derived from the input data. However, from a functional or application-oriented perspective, the insights and predictions generated by the model can be seen as new information, as they provide value and understanding beyond the explicit content of the training set.
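The same distinction shows up in unsupervised learning. Below is a small sketch (assuming scikit-learn is available; the data is synthetic and the variable names are illustrative) where k-means assigns cluster labels that appear nowhere in the input: from a data-centric view the labels are a deterministic function of the points, yet functionally they are structure the dataset never stated.

```python
import numpy as np
from sklearn.cluster import KMeans

# Synthetic 2-D points drawn around two centers; the input contains
# coordinates only -- no group labels of any kind.
rng = np.random.default_rng(0)
points = np.vstack([
    rng.normal(loc=(0.0, 0.0), scale=0.3, size=(20, 2)),
    rng.normal(loc=(3.0, 3.0), scale=0.3, size=(20, 2)),
])

# k-means discovers two groups and assigns every point a label --
# labels that were never present in the data it was shown.
kmeans = KMeans(n_clusters=2, n_init=10, random_state=0).fit(points)
print("cluster centers:\n", kmeans.cluster_centers_)
print("labels:", kmeans.labels_)
```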