OpenCyc – Is this the Knowledgebase for the Next Generation Chatbots?


The most annoying thing about chatbots is that you can fool even the best-trained ones within a short time. The cause is a shortage of knowledge they can draw on to answer your questions.

From time to time, especially when I look for new ideas for avatar creation, Google also turns up links about chatbots. Chatbots are pretty interesting. They have to present themselves as avatars, the more humanoid the better. They also have to parse natural language and find correct answers, which are then delivered by a computer voice. Some of those technologies are commodities; others may need years of exploration because they belong to the field of Artificial Intelligence. All in all, there’s a lot of interesting stuff in chatbots.

My own tests have shown that it is pretty time-consuming to train a chatbot well enough that it can be offered to the public. So I dropped the idea some years ago, when I first heard of chatbots. Even today there’s not much improvement.

In the past I even read about a project that tried to let a chatbot learn like a baby. The baby should be about 5 years old by now. Sure, it is a nice idea to create an artificial human. But as long as no doctor can describe how the learning mechanics of our brain really work, this can only be an experiment, like the design of a neural network. The simplification can help to solve simpler problems, but the construct itself is too simple to solve what it was originally designed for.

So, the best solution is a knowledgebase with so many entries that no human can find an area where missing knowledge lets them fool the chatbot. Wikipedia could be a first step. If it offers a Web Service interface, we could program chatbots to ask Wikipedia for the answer. The keywords and links between topics are a good base for creating suitable answers. Maybe a bit of Google could deliver some more material on topics that are not well covered in Wikipedia.
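To make the idea a bit more concrete, here is a minimal sketch of how keywords and links between topics could drive an answer. Everything in it is an assumption for illustration: the `Topic` class, the tiny `KB` dictionary and the `answer` function are hypothetical stand-ins for a real knowledgebase or Web Service, not an actual Wikipedia or OpenCyc interface.

```python
import re
from dataclasses import dataclass, field


@dataclass
class Topic:
    """One knowledgebase entry (hypothetical structure)."""
    title: str
    summary: str
    links: list[str] = field(default_factory=list)  # keys of related topics


# Tiny in-memory stand-in for a large knowledgebase (illustrative data only).
KB = {
    "chatbot": Topic(
        "Chatbot",
        "A chatbot is a program that converses in natural language.",
        ["avatar"],
    ),
    "avatar": Topic(
        "Avatar",
        "An avatar is a graphical representation of a user or a program.",
        ["chatbot"],
    ),
}


def answer(question: str, kb: dict[str, Topic]) -> str:
    """Return the summary of the first topic whose keyword occurs in the question."""
    words = re.findall(r"[a-z]+", question.lower())
    for word in words:
        topic = kb.get(word)
        if topic is not None:
            # Use the links between topics to enrich the reply.
            related = ", ".join(kb[key].title for key in topic.links if key in kb)
            reply = topic.summary
            if related:
                reply += f" Related: {related}."
            return reply
    # This is the gap a human can exploit: no entry, no answer.
    return "I do not know anything about that yet."


print(answer("What is a chatbot?", KB))
print(answer("Tell me about quantum physics", KB))
```

The second question shows the weak spot: with only two entries, almost any topic falls outside the knowledgebase. The larger the knowledgebase, the harder such gaps are to find.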

Indeed, there is a project that works a bit like this: OpenCyc. The idea behind it is the delivery of free semantic content. That could be the knowledgebase for a smart chatbot that knows a lot without a long initial training phase. Best of all, OpenCyc comes with an Apache 2.0 license and Java support. But at the moment, internal updates and unreleased changes prevent any real testing.

Nevertheless, if you are on Twitter, you can ask questions and get answers. This is a bit like talking to a chatbot ;-) .
