What is TAS Tagger?
TAS Tagger is a text analytics solution that extracts and defines key terms and topics from textual content. It identifies these terms and named entities (e.g., personal names, places, organizations, dates) using computational linguistics and machine learning methods and tools.
It can tag many types of textual content: web-based texts (articles and other documents), scientific content (essays, dissertations, research publications), business documents (contracts, notes), and even emails.
Currently, tagging is available in Hungarian and English; support for other languages can be added.
Why is TAS Tagger useful?
TAS Tagger provides several advantages. Tagging large bodies of text makes documents more efficient to use by:
- enriching their data (tags are metadata)
- making them easier to search (documentation or even emails)
- improving their data quality
In addition, TAS Tagger can provide data for automatic (machine-learning-based) classification of texts.
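To illustrate how tags can serve as input for text classification, the following minimal Python sketch turns the tags of each document into a binary feature vector that a classifier could consume. The function names and the sample tags are hypothetical and chosen for illustration only; they are not part of TAS Tagger.

```python
from typing import Dict, List


def build_vocabulary(tagged_docs: List[List[str]]) -> Dict[str, int]:
    """Map each distinct tag to a column index (sorted for a stable order)."""
    tags = sorted({tag for doc in tagged_docs for tag in doc})
    return {tag: i for i, tag in enumerate(tags)}


def to_feature_vector(doc_tags: List[str], vocab: Dict[str, int]) -> List[int]:
    """Return a 0/1 vector with a 1 for every tag present on the document."""
    vec = [0] * len(vocab)
    for tag in doc_tags:
        idx = vocab.get(tag)
        if idx is not None:  # ignore tags that are not in the vocabulary
            vec[idx] = 1
    return vec


# Hypothetical tagging results for three documents.
tagged_docs = [
    ["contract", "Budapest", "2021"],
    ["research", "machine learning"],
    ["contract", "machine learning"],
]

vocab = build_vocabulary(tagged_docs)
features = [to_feature_vector(doc, vocab) for doc in tagged_docs]
```

Each row of `features` can then be passed, together with a class label, to any standard classifier.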
The tagging process
- defining the text body to be tagged
- specifying the tags
- checking how precise the tags are
TAS Tagger analyses the text body and defines tags automatically; alternatively, the customer can define the set of possible tags in advance. In the latter case, we build a professional tag-database in partnership with the user, which contains the pre-defined tags. The machine learning model uses this database and can be re-trained every time the tag-database changes. The user can carry out this re-training through the TAS user interface, and the tagging process can be tracked on the same GUI. Once a tag is accepted, the software stores it. The system also stores previously tagged text content.
The more connections and relations are defined in the tag-database, the more specific the tagging results will be. It is therefore important to build the tag-database carefully.
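The role of a pre-defined tag-database, its relations, and the storing of accepted tags can be sketched in a few lines of Python. This is only an illustration under stated assumptions: it replaces the machine learning model with a plain dictionary lookup, and `TagEntry`, `suggest_tags`, and `TagStore` are hypothetical names, not part of TAS Tagger.

```python
import re
from dataclasses import dataclass, field
from typing import Dict, List, Set


@dataclass
class TagEntry:
    """One entry in a hypothetical tag-database."""
    name: str
    synonyms: List[str] = field(default_factory=list)
    related: List[str] = field(default_factory=list)  # relations to other tags


def suggest_tags(text: str, tag_db: List[TagEntry]) -> Set[str]:
    """Suggest tags whose name or synonym occurs in the text as whole words."""
    suggestions: Set[str] = set()
    for entry in tag_db:
        for term in [entry.name] + entry.synonyms:
            pattern = r"\b" + re.escape(term) + r"\b"
            if re.search(pattern, text, flags=re.IGNORECASE):
                suggestions.add(entry.name)
                # Defined relations make the result more specific.
                suggestions.update(entry.related)
                break
    return suggestions


class TagStore:
    """Keeps accepted tags per document (the 'accept, then store' step)."""

    def __init__(self) -> None:
        self._accepted: Dict[str, Set[str]] = {}

    def accept(self, doc_id: str, tag: str) -> None:
        self._accepted.setdefault(doc_id, set()).add(tag)

    def tags_for(self, doc_id: str) -> Set[str]:
        return set(self._accepted.get(doc_id, set()))


# Hypothetical tag-database and input text.
tag_db = [
    TagEntry("contract", synonyms=["agreement"], related=["legal document"]),
    TagEntry("Budapest", related=["Hungary"]),
    TagEntry("invoice"),
]
text = "The agreement was signed in Budapest last spring."

suggested = suggest_tags(text, tag_db)

store = TagStore()
for tag in sorted(suggested):
    store.accept("doc-1", tag)  # here every suggestion is accepted
```

In this sketch the entries with relations contribute extra tags such as "Hungary", while an entry without a match ("invoice") contributes nothing, which shows why a carefully built tag-database matters.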
TAS Tagger UI
The Tagger GUI can be provided within TAS Platform (TAS Cloud service) or On Premise (locally installed). Its appearance is consistent with the corporate identity of TAS Platform. The visualization and other parts of the user interface are configurable, so the particular solution depends on the customer’s needs.
Technical description (requirements, integration, open source software used)
Initial resource requirements (On Premise, for onsite installation):
- x86_64 CPU with at least 4 cores
- at least 16 GB RAM
- 35 GB hard drive (storage may increase in some cases)
- 64-bit Linux, Windows, or macOS
- 64-bit JDK 1.8 or higher
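The requirements listed above can be checked on a target machine with a short script before installation. The following Python sketch uses only the standard library and rests on several assumptions: it treats the 35 GB figure as free disk space, counts gigabytes in decimal units, reads total RAM only where `os.sysconf` exposes it, and merely checks that a `java` executable is on the PATH without verifying its version. It is not an official TAS installer check.

```python
import os
import platform
import shutil

MIN_CORES = 4
MIN_RAM_GB = 16
MIN_FREE_DISK_GB = 35
GB = 10 ** 9  # assumption: decimal gigabytes


def total_ram_gb():
    """Return total RAM in GB, or None if the platform does not expose it."""
    try:
        pages = os.sysconf("SC_PHYS_PAGES")
        page_size = os.sysconf("SC_PAGE_SIZE")
    except (AttributeError, ValueError, OSError):
        return None  # e.g. on Windows, where os.sysconf is unavailable
    return pages * page_size / GB


def check_requirements(path="."):
    """Return a dict of check name -> True, False, or None (unknown)."""
    ram = total_ram_gb()
    free_gb = shutil.disk_usage(path).free / GB
    return {
        "cpu_arch_x86_64": platform.machine().lower() in ("x86_64", "amd64"),
        "cpu_cores": (os.cpu_count() or 0) >= MIN_CORES,
        "ram": None if ram is None else ram >= MIN_RAM_GB,
        "disk_space": free_gb >= MIN_FREE_DISK_GB,
        "java_on_path": shutil.which("java") is not None,
    }


results = check_requirements()
for name, ok in results.items():
    status = "unknown" if ok is None else ("ok" if ok else "NOT met")
    print(f"{name}: {status}")
```

A result of "unknown" means the script could not determine the value on this platform and the requirement should be verified manually.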
Accessibility and platform support for developers
Cloud API, On Premise API, and Java SDK are available.