Education scientists have embraced the shift of focus from a front-loaded teaching format to the learning process of pupils or students. Computer scientists have adopted a similar strategy in moving from mere knowledge databases and predictions of the likely next word in a sentence or paragraph to models that learn. DeepSeek has surprised the developers of most large language models (LLMs) with its successful strategy of focusing on learning and reasoning. So-called reinforcement learning is key to the training of next-generation AI models. Reasoning in most cases builds on multi-step sequences in answering a more complex question. The model then returns the answer together with the reasoning steps applied. There is a debate whether summaries or translations of texts need the reasoning function of AI models. Most of the time reasoning might not be necessary, or might even be counterproductive if the translation tried to correct obviously faulty reasoning in the source text.
Imagine also that an ordinary LLM translates a text containing fake news. A correction loop that cross-checks against reliable external sources, such as an encyclopedia or Wikipedia, would complicate the answering procedure for any text. However, this is somewhat similar to how reinforcement learning from human feedback (RLHF) works. Reinforcement learning applies a form of accuracy reward, which guides the learning or answering process with checks against mathematical or programming accuracy, for example whether a calculation arrives at the correct result. Just think of basic logic to be respected in the answer.
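To make the idea of an accuracy reward more concrete, here is a minimal sketch in Python. It is not the actual DeepSeek implementation. The function names and the "Answer:" marker are assumptions chosen for illustration only. The sketch gives a reward of 1.0 if the final answer of a response matches a known reference and 0.0 otherwise.

```python
import re
from typing import Optional


def extract_final_answer(response: str) -> Optional[str]:
    """Return the text after the last 'Answer:' marker, or None if absent.

    The 'Answer:' marker is a hypothetical convention for this sketch.
    """
    matches = re.findall(r"Answer:\s*(.+)", response)
    return matches[-1].strip() if matches else None


def accuracy_reward(response: str, reference: str) -> float:
    """Return 1.0 if the extracted final answer equals the reference, else 0.0."""
    answer = extract_final_answer(response)
    if answer is None:
        return 0.0
    try:
        # Compare numerically when both sides parse as numbers.
        return 1.0 if abs(float(answer) - float(reference)) < 1e-9 else 0.0
    except ValueError:
        # Otherwise fall back to a case-insensitive string comparison.
        return 1.0 if answer.lower() == reference.strip().lower() else 0.0


if __name__ == "__main__":
    print(accuracy_reward("Step 1: 6 * 7 = 42\nAnswer: 42", "42"))
    print(accuracy_reward("Step 1: 6 * 7 = 41\nAnswer: 41", "42"))
```

A reward of this kind needs no human judge, which is one plausible reason why the amount of human correction can shrink for tasks with a checkable result.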
Similarly, a formal accuracy control checks the answer against predefined rules and ensures that it comes back as text with a normal sentence structure, numbered reasoning steps, and an introductory and concluding phrase, like we were all asked to do at school or university. The amount of correction by humans is reduced considerably, and the computing resources required are reportedly only a fraction of those of previous LLMs, which draw on enormous data collections and gigantic data centres that consume lots of energy in processing requests. Remember the film about Kasparov, the world chess champion, who was beaten by IBM's computer Deep Blue, which not only had a huge stock of previous games and tournaments, but could also make judgments on positions and promising strategies to pursue. Don’t be surprised if a DeepSeek answer is superior to what our own mind and reasoning are capable of. Reinforcement learning is a learning tool that we may also apply ourselves, if we deem it appropriate, or just as one way of arriving at an answer. (Useful reference: Sebastian Raschka, Build a Large Language Model (From Scratch), Manning.)
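The formal control of the answer's structure mentioned above can be sketched in a similar way. This is an illustration under stated assumptions, not how any particular model is trained. The expected layout (an intro line, steps numbered "1.", "2." and so on, and a last line starting with "Conclusion:") and the function name format_reward are hypothetical.

```python
import re


def format_reward(response: str) -> float:
    """Score the structure of a response between 0.0 and 1.0.

    Hypothetical layout: an intro line, consecutively numbered steps,
    and a final line that starts with 'Conclusion:'.
    """
    lines = [ln.strip() for ln in response.strip().splitlines() if ln.strip()]
    if not lines:
        return 0.0

    # Collect the numbers of all lines that look like '1. some text'.
    step_numbers = []
    for ln in lines:
        match = re.match(r"(\d+)\.\s+\S", ln)
        if match:
            step_numbers.append(int(match.group(1)))

    # The first line counts as an intro if it is neither a step nor the conclusion.
    has_intro = (
        re.match(r"\d+\.", lines[0]) is None
        and not lines[0].startswith("Conclusion:")
    )
    # At least two steps, numbered 1, 2, 3, ... without gaps.
    has_steps = (
        len(step_numbers) >= 2
        and step_numbers == list(range(1, len(step_numbers) + 1))
    )
    has_conclusion = lines[-1].startswith("Conclusion:")

    return (has_intro + has_steps + has_conclusion) / 3


if __name__ == "__main__":
    text = (
        "To find the total, we proceed step by step.\n"
        "1. Multiply 6 by 7.\n"
        "2. The product is 42.\n"
        "Conclusion: the total is 42."
    )
    print(format_reward(text))
```

Each of the three checks contributes one third of the score, so a response that is well structured but lacks a concluding phrase still earns partial credit.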
(Image: ChatGPT, two humanoid robots thinking and discussing how to repair a notebook that is sitting on a workbench.)










The flowering season starts earlier in Europe, and bees begin earlier with their collection of nectar and their service of pollinating other flowers. In early April 2025, near Paris in France, we observe wild bees already in their daily routine. However, the risk of cold nights is still there, although those building their homes below the surface are a bit less at risk during a frosty night. Seeking a clever shelter is a good survival strategy, particularly in times of global warming. Some kinds of wild bees seem to sense this already, changing homes from one season to the next. Humans remain their toughest enemies, as they restrict the bees’ choices quite severely. Man-made pollution and herbicides are beyond bees’ control and cause havoc in the ecosystem of bees. Apiculture is an interesting science for social scientists as well, as this forerunner species of matriarchy has evolved into a well-organized, productive society. Bees are a bit harsh to each other and their communication is rather unidirectional, but they form an interesting social cosmos of its own kind.























