How can you perform textual and statistical analysis of web content with IBM SPSS?


SPSS is IBM’s tool for text and statistical analysis. Its customizable work environment lets us carry out in-depth analyses and present the results in tables and a range of visualizations.

But how can we display additional content from the Web in IBM SPSS?



With TAS Data Collector, a service developed by Precognox, additional content available on the Internet can be used within IBM SPSS.



We show this process with a step-by-step guide.


1. Selecting and downloading a particular web page (as a data source)


In the first step, we download the selected web data (in our example, the content of the Keresővilág Blog website) using the TAS Data Collector service. Because web content is mostly unstructured text, downloading comes with a number of additional tasks (data cleaning, validation), which our data specialists carry out. The result of this workflow is a structured database that is updated continuously, so the data is always available and ready to use.
The user gets access (server data, user name and password) to the structured database through a secured, password-protected channel.
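
To make the idea of turning unstructured web text into a structured table more concrete, here is a minimal Python sketch. It is not the TAS Data Collector pipeline: the `posts` table, the `clean_html` and `load_posts` helpers, the record layout and the use of SQLite as a stand-in database are all assumptions made for illustration.

```python
import sqlite3
from html.parser import HTMLParser


class TextExtractor(HTMLParser):
    """Collects the visible text of an HTML fragment, skipping script/style."""

    def __init__(self):
        super().__init__()
        self.parts = []
        self._skip = 0

    def handle_starttag(self, tag, attrs):
        if tag in ("script", "style"):
            self._skip += 1

    def handle_endtag(self, tag):
        if tag in ("script", "style") and self._skip:
            self._skip -= 1

    def handle_data(self, data):
        if not self._skip:
            self.parts.append(data)


def clean_html(raw):
    """Data cleaning: strip tags and collapse runs of whitespace."""
    parser = TextExtractor()
    parser.feed(raw)
    parser.close()
    # Text chunks are joined with a space, so block elements stay separated.
    return " ".join(" ".join(parser.parts).split())


def load_posts(conn, posts):
    """Validate and store posts; returns the number of new rows."""
    conn.execute(
        "CREATE TABLE IF NOT EXISTS posts ("
        "url TEXT PRIMARY KEY, title TEXT NOT NULL, body TEXT NOT NULL)"
    )
    before = conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
    for post in posts:
        url = post.get("url", "").strip()
        title = clean_html(post.get("title", ""))
        body = clean_html(post.get("html", ""))
        # Validation: skip records without a URL, a title or any body text.
        if not (url and title and body):
            continue
        # Already stored URLs are ignored, so repeated runs stay idempotent.
        conn.execute(
            "INSERT OR IGNORE INTO posts (url, title, body) VALUES (?, ?, ?)",
            (url, title, body),
        )
    conn.commit()
    after = conn.execute("SELECT COUNT(*) FROM posts").fetchone()[0]
    return after - before
```

Calling `load_posts(sqlite3.connect(":memory:"), records)` with a list of dictionaries that have `url`, `title` and `html` keys fills the table. A production service would add scheduling, deduplication and a server database on top of this.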


The following steps are required in the IBM SPSS interface:
2. Choose File – Import Data – Database – New Query


3. Click Add ODBC Data Source


4. Click Add


5. Select the MySQL-type connection, then click Finish


6. After entering the Connection Parameters you received from us earlier, click the OK button


7. After selecting the Data Source, click the Next button


8. Select the Table and configure the order of the Fields, then click Finish


9. Choose Graphs to visualize your data
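
For readers who want to see what the wizard assembles in steps 2 to 8, the Python sketch below builds an ODBC connection string and a query for the selected table and field order. The helper names, the driver name `MySQL ODBC 8.0 Unicode Driver`, the port and the sample values are assumptions for illustration; use the connection parameters you actually received.

```python
import re

# Plain identifiers only, so table and field names cannot inject SQL.
_IDENTIFIER = re.compile(r"^[A-Za-z_][A-Za-z0-9_]*$")


def build_connection_string(server, database, user, password,
                            driver="MySQL ODBC 8.0 Unicode Driver", port=3306):
    """Assemble a DSN-less ODBC connection string (steps 5 and 6).

    Values containing ';' are not escaped in this sketch.
    """
    parts = {
        "DRIVER": "{" + driver + "}",
        "SERVER": server,
        "PORT": str(port),
        "DATABASE": database,
        "UID": user,
        "PWD": password,
    }
    return ";".join(f"{key}={value}" for key, value in parts.items()) + ";"


def build_select(table, fields):
    """Build the query for the chosen table and field order (step 8)."""
    if not fields:
        raise ValueError("at least one field is required")
    for name in [table, *fields]:
        if not _IDENTIFIER.match(name):
            raise ValueError(f"unsafe identifier: {name!r}")
    columns = ", ".join(f"`{field}`" for field in fields)
    return f"SELECT {columns} FROM `{table}`"
```

The resulting string can be handed to any ODBC client. In Python, for example, `pyodbc.connect(connection_string)` would open the connection, provided the pyodbc package and a matching MySQL ODBC driver are installed.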


Example of completed visualization


Besides giving insight into the content of the chosen website, the visualization is a valuable business asset: it can be used in presentations, business reports, evaluations, or even competitive analyses. In this way, the potential of the huge amount of data on the Internet can be put to use.

Are you also an IBM SPSS user? Do you want to know more about our text analytics solutions?
Contact us!

Péter Hódi
+36 20/416-74-79

Read more about TAS Data Collector on the TAS Text Analytics System page.

Thanks to this kind of integration, the structured databases provided by Data Collector can also be visualized with most well-known business intelligence tools (Tableau, RapidMiner, Power BI, Google Data Studio).


Pictures: IBM SPSS interface and visualization
