 |
| Syntactical similarity - sub project |
Dec 2008 |
I have created a small project (also used in this system) called syntactical similarity.
It is open source, completely independent of the jane16 system, and hosted on Google Code for
free download.
Here is the summary from the project home page.
syntactical-similarity
Compute the syntactical similarity of texts.
Some texts are too similar to each other, for example news articles that are almost duplicates.
The difference may be a different advertisement in the middle of the text, or a slightly modified headline.
This simple program tries to compute how similar two texts are, as a percentage.
Note: This is syntactical similarity, not semantic similarity.
This means that only the structure of words and phrases is taken into account, not their meaning.
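As a rough illustration of the idea, here is a minimal sketch in Python. It is not the algorithm used by the syntactical-similarity project; it assumes that comparing the two texts as word sequences with the standard library's difflib is a reasonable stand-in. The function name similarity_percent is made up for this example.

```python
from difflib import SequenceMatcher


def similarity_percent(text_a: str, text_b: str) -> float:
    """Return how similar two texts are, as a percentage from 0 to 100.

    Hypothetical helper: only the word sequences are compared,
    the meaning of the words plays no role.
    """
    # Normalise case and whitespace, then compare word by word.
    words_a = text_a.lower().split()
    words_b = text_b.lower().split()
    if not words_a and not words_b:
        return 100.0  # two empty texts are treated as identical
    matcher = SequenceMatcher(None, words_a, words_b, autojunk=False)
    # ratio() is 2 * matching_words / total_words, in the range 0..1.
    return 100.0 * matcher.ratio()


article = "Markets rise as investors welcome the new budget"
copy_with_ad = "Markets rise as investors BUY NOW welcome the new budget"
print(f"{similarity_percent(article, copy_with_ad):.1f}%")
```

Two near-duplicate articles that differ only by an inserted advertisement score close to 100, while unrelated texts score close to 0.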
|
|
| Full sentiment database release as open source |
Nov 2008 |
|
Due to huge demand, I have decided to release the full version of the sentiment database as open source.
There are also some significant changes in how the database is loaded: I have changed the file format from XML to CSV, which reduces the file size a lot (and improves loading performance as well). A full copy of this online sentiment engine is now available for local download from
sourceforge.net
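To give a feeling for why CSV is more compact than XML, here is a small Python sketch. The record layout (word, positive count, negative count) and the element names are assumptions made up for this example; the real file format of the sentiment database is not described here.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Hypothetical records: (word, positive count, negative count).
records = [("good", 120, 3), ("bad", 4, 98), ("fine", 40, 10)]


def to_xml(rows) -> str:
    """Serialise the rows as XML, one <entry> element per row."""
    root = ET.Element("sentiment")
    for word, pos, neg in rows:
        ET.SubElement(root, "entry", word=word,
                      positive=str(pos), negative=str(neg))
    return ET.tostring(root, encoding="unicode")


def to_csv(rows) -> str:
    """Serialise the same rows as CSV, one line per row."""
    buffer = io.StringIO()
    csv.writer(buffer).writerows(rows)
    return buffer.getvalue()


def from_csv(text: str):
    """Load the rows back from CSV text."""
    return [(w, int(p), int(n)) for w, p, n in csv.reader(io.StringIO(text))]


print(len(to_xml(records)), "characters as XML")
print(len(to_csv(records)), "characters as CSV")
```

The XML version repeats the tag and attribute names for every record, while CSV stores only the values, so the CSV text is shorter and there is less to parse when loading.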
|
|
| RSS feed with related search engine results |
Sep/Oct 2008 |
I have created an RSS feed enhanced with related search engine results. The example page is available
here. The idea is quite simple: read the RSS feed, analyse the content of each article,
get related search engine results (using Yahoo BOSS) and put everything together.
I also added quite a few enhancements to the database engine, mostly performance related.
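The read, analyse, search and combine steps of the enhanced feed can be sketched in Python roughly as follows. This is only an outline under assumptions: the feed is plain RSS 2.0, and extract_keywords and search are hypothetical callables standing in for the jane16 analysis and the Yahoo BOSS query, neither of which is shown here.

```python
import xml.etree.ElementTree as ET
from typing import Callable, Dict, List


def enhance_feed(rss_xml: str,
                 extract_keywords: Callable[[str], List[str]],
                 search: Callable[[str], List[str]]) -> List[Dict]:
    """Attach related search results to each item of an RSS 2.0 feed.

    extract_keywords and search are placeholders supplied by the caller.
    """
    root = ET.fromstring(rss_xml)
    enhanced = []
    for item in root.iter("item"):
        title = item.findtext("title", default="")
        description = item.findtext("description", default="")
        # Analyse the article content to get a search query.
        keywords = extract_keywords(title + " " + description)
        query = " ".join(keywords)
        # Put the article and its related results together.
        enhanced.append({
            "title": title,
            "link": item.findtext("link", default=""),
            "related": search(query) if query else [],
        })
    return enhanced
```

Passing the analysis and search steps in as functions keeps the sketch independent of any particular search API.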
|
|
| New site released - version 2.0 |
Aug 2008 |
|
A new version of the www.jane16.com site and engine has been released.
There are many enhancements on the site, plus a new part of the engine, called sentiment analysis,
capable of extracting positive or negative mood from opinionated text.
|
|
| Sentiment database as open source on sourceforge.net |
Aug 2008 |
|
A new project called jane16sentiment has been created on www.sourceforge.net. It is an
open source version of this system, with a slightly smaller database in place.
Available for download here.
|
|
| More database training |
July 2008 |
|
The secondary sentiment engine database is being trained, while the primary one is being trained further as well.
The secondary database is meant to support the primary one and is also to be used in the light version of the sentiment system.
(It turns out that, contrary to the previous month's statement, the primary database is not trained enough and needs to
be given more data to learn.)
|
|
| Sentiment databases almost ready |
Jun 2008 |
|
The sentiment databases are almost ready; that is, the sample size is
big enough to be used in real production. |
|
| Sentiment engine - start of the project |
Mar 2008 |
|
Training of the sentiment engine databases and development of the engine supporting them have started.
The idea is based on a basic statistical mapping of real data from the Internet and its usage throughout the
jane16 engine.
|
|
| Jane16 - full rewrite of the database |
Jan 2008 |
|
Full rewrite of the database, including defragmentation, rectification and
fixing of some mid-level bugs. This is necessary to have the system ready for the next stage of development.
Defragmentation in particular speeds up performance a lot.
|
|
| Jane16 - Added clustering section to the site |
Nov 2007 |
|
Added a clustering section for already found search results. It is capable of grouping the search results into clusters based
on criteria analysed by the jane16 engine.
|
|
| www.jane16.com version 1.0 |
Sep 2007 |
|
www.jane16.com is released as a free online text analyser. It supports keyword extraction,
scope analysis and summary extraction from the text provided.
|
|
| www.jane16.com start of development |
Mar 2007 |
|
The first version of jane16 is being developed and the system is trained with the first real data.
|
|
| jane16 engine idea has been created |
2005-2006 |
|
The basic core of the database is being developed. This core supports only a 16-byte unit size, and that is where
the number 16 in the name of the system comes from.
|
|
| jane16 engine idea has been created |
2004 |
|
Around this year the idea of this system started and slowly crept into the light of the world.
It went rather slowly at the beginning, with many questions about how to do certain things.
|
|