Approaches to Data Analysis for the Business Analyst. Data Analysis Objectives

Every large business, and most mid-sized ones, faces the problem of management being given inaccurate data about the state of the company. The reasons vary, but the consequences are always the same: wrong or untimely decisions that hurt the effectiveness of financial operations. To prevent such situations, a professional business analytics system, or BI (Business Intelligence), is used. These high-tech "assistants" help build a system of management control over every aspect of the business.

At its core, a BI system is advanced analytical software for business analysis and reporting. These programs can take data from various information sources and present it in a convenient form and slice. As a result, management gets quick access to complete and transparent information about the state of the company. A distinctive feature of reports obtained with BI is that the manager can independently choose the context in which to view the information.


Modern Business Intelligence systems are multifunctional, which is why large companies are gradually replacing other methods of obtaining business reporting with them. Experts list their main capabilities as:

  • Connections to various databases;
  • Formation of reports of varying complexity, structure, type and layout at high speed. Reports can also be generated on a schedule and distributed automatically, without direct user participation;
  • Transparent work with data;
  • Providing a clear connection between information from various sources;
  • Flexible and intuitive setting of employee access rights in the system;
  • Exporting data in any convenient format: PDF, Excel, HTML and many others.

The capabilities of business intelligence information systems free a leader from depending on the IT department or on assistants to provide the required information. They are also a great way to demonstrate the correctness of your decisions not in words but in precise numbers. Many large network corporations in the West have long used BI systems, including the world-famous Amazon, Yahoo, Wal-Mart and others. These corporations spend a lot of money on business intelligence, but the implemented BI systems bring invaluable benefits.

The benefits of professional business intelligence systems are based on the principles that are supported in all advanced BI applications:

  1. Visibility. The main interface of any business analysis software should reflect the key indicators. Thanks to this, the manager can quickly assess the state of affairs in the enterprise and take action if necessary;
  2. Customization. Each user should be able to customize the interface and function keys in the most convenient way for themselves;
  3. Layering. Each dataset should have several sections (layers) to provide the level of detail that is needed at a particular level;
  4. Interactivity. Users should be able to collect information from all sources and along several dimensions at the same time. The system should also support configurable notifications on key parameters;
  5. Multi-user operation and access control. A BI system should support simultaneous work by a large number of users, with the ability to assign them different access levels.

The entire IT community agrees that business intelligence information systems are one of the most promising areas of industry development. However, their implementation is often hampered by technical and psychological barriers, uncoordinated work of managers and the absence of prescribed areas of responsibility.

When thinking about the implementation of BI-class systems, it is important to remember that the success of the project will largely depend on the attitude of the company's employees to innovation. This applies to all IT products: skepticism and fear of downsizing can undermine all implementation efforts. Therefore, it is very important to understand what feelings the business intelligence system evokes in future users. The ideal situation will arise when the company's employees treat the system as an assistant and a tool for improving their work.

Before starting a BI implementation project, it is necessary to conduct a thorough analysis of the company's business processes and of the principles by which management decisions are made: it is precisely these data that will feed the analysis of the company's situation. This analysis will also help in choosing a BI system, along with other key criteria:

  1. Goals and objectives of implementing BI systems;
  2. Requirements for storing data and the ability to operate with them;
  3. Data integration functions. Without using data from all sources in the company, management will not be able to get a holistic picture of the state of affairs;
  4. Visualization capabilities. For each person, the ideal BI analytics looks different, and the system must meet the needs of each user;
  5. Versatility or narrow specialization. There are systems in the world aimed at a specific industry, as well as universal solutions that allow you to collect information in any aspect;
  6. Resource requirements and the price of the software product. As with any software, the choice of a BI system depends on what the company can afford.

The above criteria will help management make an informed choice among the wide variety of well-known business intelligence systems. There are other parameters (e.g., storage structure, web architecture), but evaluating them requires expertise in narrow IT areas.

It's not enough just to make a choice, buy software, install and configure it. Successful implementation of BI systems in any direction is based on the following rules:

  • Correctness of data. If the data fed into the analysis is incorrect, there is a risk of seriously wrong conclusions;
  • Comprehensive training for each user;
  • Fast implementation. Focus on getting the right reports to all key locations rather than on serving a single user perfectly. You can always adjust a report's appearance or add another section after implementation;
  • Be realistic about the ROI of your BI system. The effect depends on many factors and in some cases becomes visible only after a few months;
  • The equipment should be designed not only for the current situation, but also for the near future;
  • Understand why the BI system implementation was started, and do not demand the impossible from the software.


According to statistics, only 30% of company executives are satisfied with the implementation of BI systems. Over the years of the existence of business analysis software, experts have formulated 9 key mistakes that can reduce efficiency to a minimum:

  1. The purpose of the implementation is not obvious to management. Often a project is driven by the IT department without close involvement of managers. In most cases, questions about the purpose and objectives of the BI system, its benefits and usability, arise only during implementation and operation;
  2. Lack of transparency in management, employees' work and decision-making. Managers may not know how field employees actually work, and management decisions may be made on more than dry facts alone. As a result, the existing paradigm cannot survive the implementation of a BI system, and a corporate governance culture that has developed over years is often impossible to break;
  3. Insufficient data reliability. False information must not get into the business analysis system; otherwise employees will not be able to trust it and use it;
  4. The wrong choice of a professional business intelligence system. History offers many examples where management hires a third-party organization to implement a BI system and takes no part in choosing it. The result is a system that cannot produce the required report, or that cannot be integrated with some of the software the company already uses;
  5. Lack of a plan for the future. A BI system is not static software: you cannot finish the implementation project and forget about it, because users and management keep generating requirements for improvements;
  6. Transferring BI system support to a third-party organization. As practice shows, this most often leads to the system becoming isolated from the real state of affairs. An in-house support service responds much faster and more effectively to user feedback and management requirements;
  7. The desire to save money. In business this is normal, but BI analytics only works when it takes into account all aspects of the company's activities. This is why expensive, deep analytics systems are the most effective. Settling for a handful of reports on areas of interest leads to frequent data errors and heavy dependence on the qualifications of IT specialists;
  8. Different terminology in the company. It is important that all users understand the basic terms and their meaning. A simple misunderstanding can lead to misinterpretation of the reports and indicators of the BI system;
  9. Lack of a unified business analysis strategy at the enterprise. Without a single course chosen for all employees, any BI class system will be just a set of disparate reports that satisfy the requirements of individual managers.

Implementing a BI system is an important step that can help take your business to the next level. But it requires not only a sizable financial investment, but also time and effort from every employee of the company. Not every business is ready to competently see a business analysis implementation project through.


Lately there is so much talk about information analysis that one can become thoroughly confused about the problem. It is good that many people are paying attention to such a pressing topic. The bad part is that everyone understands the term to mean whatever they need it to, often without a general picture of the problem. This fragmented approach is the reason for the misunderstanding of what is happening and what to do. Everything consists of pieces that are weakly connected to each other and have no common core. You have probably heard the phrase "patchwork automation" many times. Many have run into this problem repeatedly and can confirm that its main drawback is that it is almost never possible to see the whole picture. The situation with analysis is similar.

To understand the place and purpose of each analysis mechanism, let us consider the whole picture. It will be based on how a person makes decisions; since we cannot explain how a thought is born, we will concentrate on how information technology can be used in the process. The first option: the decision maker (DM) uses the computer only as a means of extracting data and draws conclusions on his own. To solve this kind of problem, reporting systems, multidimensional data analysis, charts and other visualization methods are used. The second option: the program not only extracts data but also performs various kinds of preprocessing, such as cleaning and smoothing, and then applies mathematical methods of analysis to the processed data: clustering, classification, regression, and so on. In this case the decision maker receives not raw but heavily processed data; in other words, the person works with models prepared by a computer.

In the first case, almost everything actually connected with decision-making is left to the person; the problem of selecting an adequate model and choosing processing methods falls outside the analysis mechanisms, so the basis for a decision is either an instruction (for example, mechanisms for responding to deviations) or intuition. In some cases this is quite enough, but if the decision maker is interested in deeper knowledge, mere data-extraction mechanisms will not help; more serious processing is required. That is exactly the second case. All the preprocessing and analysis mechanisms it applies allow decision makers to work at a higher level. The first option is suitable for tactical and operational tasks; the second, for replicating knowledge and solving strategic problems.

The ideal case would be to apply both approaches to analysis. They allow you to cover almost all the needs of an organization in the analysis of business information. By varying the methodology depending on the tasks, we will be able to squeeze the maximum out of the available information in any case.

The general scheme of work is shown below.

Often, when describing a product that analyzes business information, terms such as risk management, forecasting and market segmentation are used. But in reality the solution to each of these problems reduces to one of the analysis methods described below. For example, forecasting is a regression problem, market segmentation is clustering, and risk management is a combination of clustering and classification (other methods are also possible). This set of technologies therefore allows you to solve most business problems. In effect, they are atomic (basic) elements from which the solution to a particular problem is assembled.

Now let us describe each fragment of the scheme separately.

The primary source of data should be databases of enterprise management systems, office documents, the Internet, because it is necessary to use all the information that may be useful for making a decision. Moreover, we are talking not only about internal information for the organization, but also about external data (macroeconomic indicators, competitive environment, demographic data, etc.).

Although a data warehouse does not itself implement analysis technologies, it is the foundation on which to build an analytical system. Without a data warehouse, collecting and organizing the information needed for analysis will consume most of the time, largely negating the advantages of analysis. After all, one of the key indicators of any analytical system is the ability to get results quickly.

The next element of the scheme is the semantic layer. However the information is going to be analyzed, it must be understandable to the decision maker. In most cases the analyzed data sit in different databases, and the decision maker should not have to dig into the nuances of working with a DBMS, so a mechanism is needed that translates domain terms into calls to database access mechanisms. This is the job of the semantic layer. It is desirable for it to be shared by all analysis applications: that makes it easier to apply different approaches to a problem.
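As a minimal sketch of the idea (all table and column names here are hypothetical, not taken from any real system), a semantic layer can be reduced to a mapping from business terms to the physical storage behind them, plus a query generator:

```python
# Minimal sketch of a semantic layer: business terms map to physical
# tables/columns, so users never deal with the DBMS directly.
# All table and column names are invented for illustration.

SEMANTIC_LAYER = {
    "revenue":  {"table": "sales_fact",   "column": "amount"},
    "customer": {"table": "customer_dim", "column": "customer_name"},
    "month":    {"table": "date_dim",     "column": "month_name"},
}

def build_query(measure: str, dimension: str) -> str:
    """Translate a pair of business terms into a SQL aggregation query."""
    m = SEMANTIC_LAYER[measure]
    d = SEMANTIC_LAYER[dimension]
    return (
        f"SELECT {d['column']}, SUM({m['column']}) "
        f"FROM {m['table']} JOIN {d['table']} USING (id) "
        f"GROUP BY {d['column']}"
    )

print(build_query("revenue", "month"))
```

The decision maker asks for "revenue by month"; the layer, not the user, knows which tables and joins that requires.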

Reporting systems are designed to answer the question "what is going on?". The first use case: regular reports used to monitor the operational situation and analyze deviations. For example, the system prepares daily reports on warehouse stock, and when stock falls below the average weekly sales, you respond by preparing a purchase order; in most cases these are standardized business operations. Most companies have implemented some elements of this approach in one form or another (even if only on paper), but it must not remain the only available approach to data analysis. The second use case: handling ad hoc queries. When a decision maker wants to test a thought (hypothesis), he needs food for thought that confirms or refutes the idea. Since such thoughts come spontaneously and there is no precise notion of what information will be required, a tool is needed that lets him quickly obtain this information in a convenient form. The extracted data are usually presented as tables or as graphs and charts, although other representations are possible.

Although reporting systems can be built in different ways, the most common approach today is the OLAP mechanism. The main idea is to represent information as multidimensional cubes, where the axes are dimensions (for example, time, products, customers) and the cells hold measures (for example, sales amount or average purchase price). The user manipulates the dimensions and receives the information in the desired slice.
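The cube idea can be illustrated in miniature with pandas (the fact table below is invented): rows are individual sales facts, and a pivot aggregates the measure along two chosen dimensions:

```python
import pandas as pd

# Toy fact table: each row is one sale (invented data).
sales = pd.DataFrame({
    "month":   ["Jan", "Jan", "Feb", "Feb", "Feb"],
    "product": ["A",   "B",   "A",   "A",   "B"],
    "amount":  [100,   150,   120,   80,    200],
})

# A two-dimensional slice of the cube: months x products,
# with total sales amount as the measure in each cell.
cube = sales.pivot_table(index="month", columns="product",
                         values="amount", aggfunc="sum", fill_value=0)
print(cube)
```

Swapping `index` and `columns`, or pivoting on a customer dimension instead, gives the same data in a different slice, which is exactly the manipulation an OLAP user performs interactively.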

Thanks to its ease of understanding, OLAP has become widespread as a data analysis engine, but you need to understand that its capabilities for deeper analysis, such as forecasting, are extremely limited. The main problem in forecasting is not extracting the data of interest as tables and diagrams but building an adequate model; once that is done, everything is quite simple. New information is fed into the existing model, passed through it, and the result is the forecast. But building the model is not a trivial task at all. Of course, you can ship a few ready-made simple models with the system, for example linear regression or something similar, and quite often that is what is done, but it does not solve the problem. Real-world problems almost always go beyond such simple models. Such a model will detect only explicit dependencies, whose discovery is of little value because they are well known anyway, or it will make predictions too rough to be of any interest. For example, if when analyzing stock prices you proceed from the simple assumption that tomorrow a share will cost the same as today, you will be right in 90% of cases. But how valuable is such knowledge? Only the remaining 10% are of interest to brokers. Primitive models, in most cases, give results at about that level.
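The "tomorrow equals today" baseline from the example is trivial to implement, which is precisely why it carries no insight; a hypothetical toy price series makes the point:

```python
# Persistence ("naive") forecast: predict that the next value equals
# the current one. The price series is invented for illustration.
prices = [100.0, 100.5, 100.4, 101.0, 100.9, 101.2, 101.1, 101.3]

forecasts = prices[:-1]          # forecast for day t+1 is the price on day t
actuals   = prices[1:]

mae = sum(abs(f - a) for f, a in zip(forecasts, actuals)) / len(actuals)
print(f"MAE of naive forecast: {mae:.3f}")
```

On a slowly drifting series the error looks impressively small, yet the forecast merely restates the last observation: it can never anticipate the turns that are the only thing worth predicting.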

The correct approach to building models is to improve them step by step: start with a first, relatively rough model and refine it as new data accumulate and the model is applied in practice. The task of making forecasts, and similar tasks, genuinely lies beyond the mechanisms of reporting systems, so do not expect positive results in this direction from OLAP. For deeper analysis, an entirely different set of technologies is used, united under the name Knowledge Discovery in Databases.

Knowledge Discovery in Databases (KDD) is the process of turning data into knowledge. KDD covers data preparation, selection of informative features, data cleansing, application of Data Mining (DM) methods, post-processing and interpretation of the results. Data Mining is the process of discovering in "raw" data knowledge that is previously unknown, non-trivial, practically useful and open to interpretation, and that is needed for making decisions in various spheres of human activity.

The attractiveness of this approach lies in the fact that regardless of the subject area, we apply the same operations:

  1. Extract data. In our case, this requires a semantic layer.
  2. Clean data. Using "dirty" data can completely undermine the analysis mechanisms applied later.
  3. Transform data. Different analysis methods require specially prepared data; for example, some accept only numeric inputs.
  4. Perform the actual analysis: Data Mining.
  5. Interpret the results obtained.

This process is repeated iteratively.
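A deliberately tiny sketch of steps 2–5 in plain Python (the sensor records and the one-rule "mining" step are invented for illustration): dirty records are cleaned, transformed to numeric form, a trivial mining step derives a threshold rule, and the result is directly interpretable:

```python
# Minimal KDD-style pipeline on invented data:
# extract -> clean -> transform -> mine -> interpret.

raw = [
    {"sensor": "s1", "temp": "85", "failed": "yes"},
    {"sensor": "s2", "temp": None, "failed": "no"},   # dirty record
    {"sensor": "s3", "temp": "40", "failed": "no"},
    {"sensor": "s4", "temp": "90", "failed": "yes"},
    {"sensor": "s5", "temp": "55", "failed": "no"},
]

# Step 2: clean -- drop records with missing values.
clean = [r for r in raw if r["temp"] is not None]

# Step 3: transform -- convert fields to numeric form.
data = [(float(r["temp"]), 1 if r["failed"] == "yes" else 0)
        for r in clean]

# Step 4: mine -- find the temperature threshold that best separates
# failed from healthy sensors (a one-rule "model").
errors, threshold = min(
    (sum((temp >= t) != bool(label) for temp, label in data), t)
    for t in sorted(temp for temp, _ in data)
)

# Step 5: interpret -- the result is a human-readable rule.
print(f"Rule: failure if temp >= {threshold} ({errors} errors)")
# -> Rule: failure if temp >= 85.0 (0 errors)
```

In a real KDD cycle each step is far richer, and the loop is rerun as new data arrive, but the shape of the process is the same.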

Data Mining, in turn, solves only six kinds of tasks: classification, clustering, regression, association, sequence analysis and deviation analysis.

This is all that needs to be done to automate the knowledge extraction process. Further steps are already being taken by an expert, who is also a decision maker.

Interpreting the results of computer processing is the person's responsibility; different methods simply provide different food for thought. In the simplest case these are tables and charts; in more complex cases, models and rules. Human participation cannot be excluded entirely, because a result has no meaning until it is applied to a specific subject area. However, knowledge can be replicated. For example, a decision maker uses some method to determine which indicators affect the creditworthiness of buyers and expresses this as a rule. The rule can be introduced into the loan-issuing system and thus significantly reduce credit risks by putting its assessments on stream. Meanwhile, the person who actually issues the loans needs no deep understanding of the reasons behind a given conclusion. In effect, this transfers methods once applied in industry into the field of knowledge management. The main idea is to move from one-off, non-uniform methods to conveyor-style ones.
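Continuing the loan example (the indicator names and cut-offs below are invented, not taken from any real scoring system): once an analyst has expressed a finding as a rule, it can be embedded in the loan-issuing workflow and applied uniformly by people who never saw the analysis:

```python
# A hypothetical rule, derived once by an analyst and then replicated
# in the loan-issuing system. Indicator names and cut-offs are invented.

def credit_decision(applicant: dict) -> str:
    """Apply the analyst's rule; the operator needs no knowledge of
    why the rule works, only that it was validated."""
    risky = (applicant["debt_to_income"] > 0.4
             or applicant["late_payments_last_year"] > 2)
    return "manual review" if risky else "approve"

print(credit_decision({"debt_to_income": 0.2, "late_payments_last_year": 0}))
print(credit_decision({"debt_to_income": 0.5, "late_payments_last_year": 1}))
```

This is the "conveyor" in miniature: the expensive analysis happens once, and its product, the rule, is applied thousands of times at no analytical cost.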

Everything mentioned above is just the names of the tasks; to solve each of them, various techniques can be applied, from classical statistical methods to self-learning algorithms. Real business problems are almost always solved by one of these methods or a combination of them. Almost all tasks, including forecasting, market segmentation, risk assessment, assessment of advertising campaign effectiveness, assessment of competitive advantages and many others, reduce to those described above. So with a tool that solves this list of tasks at your disposal, you can say you are ready to solve any business analysis problem.

Note that we have said nothing about which tool or technologies to use for the analysis, because the tasks themselves and the methods for solving them do not depend on the toolkit. This is simply a description of a competent approach to the problem. You can use whatever you like; what matters is that the entire list of tasks is covered. Only then can a solution be called truly full-featured. Very often, mechanisms that cover only a small part of the tasks are marketed as a "full-featured solution to business analysis problems". Most often, a business information analysis system is taken to mean OLAP alone, which is completely insufficient for full-fledged analysis: under a thick layer of advertising slogans there is only a reporting system. Showy descriptions of this or that analysis tool obscure the essence, but start from the scheme proposed above and you will see the real state of affairs.

Accessible work with Big Data using visual analytics

Improve business intelligence and solve routine tasks using information hidden in Big Data with the TIBCO Spotfire platform. It is the only platform that provides business users with an intuitive, easy-to-use user interface that enables the full range of analytic technologies for Big Data without the need for IT professionals or training.

The Spotfire interface makes it equally convenient to work with both small datasets and multi-terabyte clusters of big data: sensor readings, information from social networks, points of sale or geolocation sources. Users of all skill levels can easily navigate meaningful dashboards and analytic workflows simply by using visualizations that graphically represent the aggregation of billions of data points.

Predictive analytics means learning from the company's accumulated collective experience in order to make more informed decisions. With Spotfire Predictive Analytics, you can discover new market trends from business intelligence insights and act on them to minimize risk, leading to better management decisions.

Overview

Big Data Connectivity for High-Performance Analytics

Spotfire offers three main types of analytics with seamless integration with Hadoop and other large data sources:

  1. On-Demand Analytics Data Visualization: Built-in, user-configurable data connectors that facilitate ultra-fast, interactive data visualization
  2. In-Database Analytics: integration with a distributed computing platform that enables calculations of any complexity on big data.
  3. In-Memory Analytics: Integration with a statistical analysis platform that pulls data directly from any data source, including traditional and new data sources.

Together, these integration methods represent a powerful combination of visual exploration and advanced analytics.
It enables business users to access, aggregate, and analyze data from any data source through powerful, easy-to-use dashboards and workflows.

Big data connectors

Spotfire big data connectors support all kinds of data access: in-datasource, in-memory and on-demand. Spotfire's built-in data connectors include:

  • Hadoop Certified Data Connectors for Apache Hive, Apache Spark SQL, Cloudera Hive, Cloudera Impala, Databricks Cloud, Hortonworks, MapR Drill, and Pivotal HAWQ
  • Other certified big data connectors include Teradata, Teradata Aster, and Netezza
  • Connectors for historical and current data from sources such as OSI PI sensors

In-Datasource Distributed Computing

In addition to convenient visual construction of SQL queries over data distributed across sources, Spotfire can build statistical and machine learning algorithms that execute inside the data sources and return only the results needed to create visualizations.

  • Users work with dashboards offering visual selection functionality that invokes scripts written in the built-in TERR language;
  • TERR scripts initiate distributed computing in interaction with Map/Reduce, H2O, SparkR or Fuzzy Logix;
  • These applications in turn access high-performance systems such as Hadoop or other data sources;
  • TERR can be deployed as an advanced analytics engine on Hadoop nodes driven by MapReduce or Spark, and can also be used on Teradata data nodes;
  • The results are visualized in Spotfire.

TERR for advanced analytics

TIBCO Enterprise Runtime for R (TERR) is an enterprise-grade statistical package developed by TIBCO for full compatibility with the R language, embodying the company's many years of experience with the analytical system associated with S+. Customers can thus continue developing applications and models not only in open-source R, but also integrate and deploy their R code on a commercial, reliable platform without rewriting it. Compared with open-source R, TERR is more efficient, manages memory more reliably, and processes large data volumes faster.

Combining all the functionality

Combining the aforementioned powerful functionality means that even for the most complex tasks requiring highly reliable analytics, users interact with simple, easy-to-use interactive workflows. This allows business users to visualize and analyze data, as well as share analytics results, without the need to know the details of the data architecture underlying the business analysis.

Example: a Spotfire interface for configuring, running and visualizing the results of a model that characterizes lost shipments. Through this interface, business users can run computations in TERR and H2O (a distributed computing framework) against transaction and shipment data stored in Hadoop clusters.

Analytical space for big data


Advanced and predictive analytics

Users use Spotfire's visual selection dashboards to launch a rich set of advanced features that make it easy to make predictions, create models, and optimize them on the fly. Using big data, analysis can be done inside the data source (In-Datasource), returning only the aggregated information and results needed to create visualizations on the Spotfire platform.


Machine learning

A wide range of machine learning tools are available in Spotfire's list of built-in features that can be used with a single click. Statisticians have access to the program code written in the R language and can expand the functionality used. Machine learning functionality can be shared with other users for easy reuse.

The following machine learning methods for continuous and categorical variables are available in Spotfire and TERR:

  • Linear and logistic regression
  • Decision trees, Random forest, Gradient Boosting Machine (GBM)
  • Generalized Additive Models (GAM)
  • Neural networks


Content analysis

Spotfire provides analytics and visualization for data a significant part of which was previously unused: unstructured text stored in sources such as documents, reports, CRM system notes, site logs, social media posts and much more.


Location analytics

Layered high-resolution maps are a great way to visualize big data. Spotfire's rich mapping functionality lets you create maps with as many reference and functional layers as you need, and sophisticated analytics can be applied while working with them. In addition to geographic maps, the system builds maps to visualize user behavior, warehouses, production, raw materials and many other indicators.

Over decades of working with large customers, Force has accumulated vast experience in business analysis and is now actively developing big data technologies, says Olga Gorchinskaya, Director of Research Projects and Head of Big Data at Force.

15.10.2015

Olga Gorchinskaya

In recent years the generation of leaders has changed. New people have come to run companies, people who built their careers in the era of informatization and are used to using computers, the Internet and mobile devices both in everyday life and for work tasks.

CNews: To what extent are BI tools in demand among Russian companies? Is the approach to business analysis changing, from "analytics in the style of Excel" to the use of analytical tools by top managers?

Olga Gorchinskaya:

Today, the need for business analysis tools is already quite high. They are used by large organizations in almost all sectors of the economy. Midsize and small businesses alike understand the benefits of moving from Excel to dedicated analytics solutions.

If we compare this situation with the one in companies even five years ago, we see significant progress. In recent years the generation of leaders has changed. New people have come to run companies, people who built their careers in the era of informatization and are used to using computers, the Internet and mobile devices both in everyday life and for work tasks.

CNews: But there are no more projects?

Olga Gorchinskaya:

Recently, we have noted a slight decrease in the number of new large BI projects. First, the complex general economic and political situation plays a role. It is holding back the start of some projects related to the introduction of Western systems. Interest in solutions based on free software also delays the start of BI projects, since it requires a preliminary study of this software segment. Many open source analytics solutions are not mature enough to be widely used.

Secondly, there has already been a certain saturation of the market. There are not many organizations nowadays that do not use business analysis. And, apparently, the time of active growth in the implementation of large corporate analytical systems is passing.

And finally, it is important to note that customers are now shifting their emphasis in the use of BI tools, which is holding back the growth in the number of projects we are used to. The fact is that the leading vendors Oracle, IBM and SAP base their BI solutions on the idea of a single consistent logical data model: before analyzing anything, all concepts and indicators must be clearly defined and agreed upon.

Along with the obvious advantages, this leads to a strong dependence of business users on IT specialists: if some new data needs to be included in the scope of consideration, the business has to constantly turn to IT to load the data, harmonize it with existing structures, include it in the general model, and so on. Now we see that business wants more freedom, and for the sake of being able to add new structures on their own and interpret and analyze them at their own discretion, users are ready to sacrifice some corporate consistency.

So now the focus is on lightweight tools that let end users work directly with data without worrying much about corporate consistency. As a result, we are seeing the successful advance of Tableau and Qlik, which support working in the Data Discovery style, and some loss of market share by the large solution providers.

CNews: This explains why a number of organizations are implementing several BI systems - this is especially noticeable in the financial sector. But can such informatization be considered normal?



Olga Gorchinskaya:

Indeed, in practice large organizations often use not one but several independent analytical systems, each with its own BI tools. The idea of a single corporate-wide analytical model turned out to be something of a utopia: it is not very popular and even limits the spread of analytical technologies, since in practice every department, or even an individual user, wants independence and freedom. There is nothing wrong with that. After all, in the same bank, risk professionals and marketers need completely different BI tools. So it is quite normal for a company to choose not a cumbersome single solution for all tasks, but several smaller systems best suited to individual departments.

Today, tools that we previously considered too lightweight for the enterprise level are taking the lead. These are solutions of the Data Discovery class. They are based on the ideas of simplicity of working with data, speed, flexibility and an easy-to-understand presentation of analysis results. There is another reason for the growing popularity of such tools: companies increasingly need to work with information of changing structure, often unstructured, with a "vague" meaning and not always obvious value. In such cases, more flexible tools are required than the classic business analysis tools.

"Force" has created the largest in Europe and unique in Russia platform - Fors Solution Center. Its main task is to bring the latest technology Oracle to the end customer, help partners in their development and application, make the equipment and software testing processes as accessible as possible. It is a kind of data center for partner testing systems and cloud solutions.

CNews: How do big data technologies help develop business intelligence?

Olga Gorchinskaya:

These areas - big data and business intelligence - are moving closer to each other, and in my opinion the line between them is already blurred. For example, deep analytics is now considered "big data," even though it existed before Big Data. Interest in machine learning and statistics is growing, and with the help of these big data technologies it is possible to extend the functionality of a traditional business intelligence system focused on computing and visualization.

In addition, the concept of data warehouses has been expanded by the use of Hadoop technology, which has led to new standards for building corporate storage in the form of data lakes.

CNews: What are the most promising tasks for which big data solutions are used?

Olga Gorchinskaya:

We use big data technologies in BI projects in several cases. The first is when it is necessary to improve the performance of an existing data warehouse, which is very important as companies rapidly increase the amount of information they use. Storing raw data in traditional relational databases is very expensive and requires ever more processing power. In such cases it makes more sense to use the Hadoop stack, which is efficient by virtue of its architecture, flexible, adaptable to specific needs and economically attractive, since it is based on an open source solution.

With the help of Hadoop we solved, in particular, the problem of storing and processing unstructured data in one large Russian bank. In this case we were dealing with large volumes of regularly arriving data of changing structure. This information must be processed and parsed, numerical indicators extracted from it, and the original data preserved. Given the significant growth in the volume of incoming information, using relational storage for this became too expensive and inefficient. We created a separate Hadoop cluster for processing primary documents, the results of which are loaded into the relational storage for analysis and further use.

The second direction is the introduction of in-depth analytics tools to expand the functionality of the BI-system. This is a very promising area, since it is associated not only with solving IT problems, but also with creating new business opportunities.

Instead of organizing special projects to implement in-depth analytics, we try to expand the scope of existing projects. For example, a useful function for almost any system is forecasting indicators based on available historical data. This is not such an easy task: it requires not only skills in working with tools, but also a certain mathematical background and knowledge of statistics and econometrics.
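As a minimal illustration of the kind of indicator forecasting described here (a sketch under simplified assumptions, not the team's actual model), a linear trend can be fitted to a historical series with ordinary least squares and extrapolated:

```python
# Minimal sketch: forecast the next value of an indicator by fitting
# a linear trend y = a + b*t with ordinary least squares.
# Illustrative only; a real project would use richer models.

def forecast_linear(history, steps_ahead=1):
    """Fit y = a + b*t to the series and extrapolate steps_ahead points."""
    n = len(history)
    ts = range(n)
    t_mean = sum(ts) / n
    y_mean = sum(history) / n
    # slope b = cov(t, y) / var(t)
    num = sum((t - t_mean) * (y - y_mean) for t, y in zip(ts, history))
    den = sum((t - t_mean) ** 2 for t in ts)
    b = num / den
    a = y_mean - b * t_mean
    return a + b * (n - 1 + steps_ahead)

# Monthly workload of a medical organization (hypothetical figures):
visits = [1020, 1065, 1110, 1150, 1200, 1245]
print(round(forecast_linear(visits)))  # projected workload next month
```

On a perfectly linear series the extrapolation is exact; on real, noisy data it gives a first rough estimate that more serious statistical models would refine.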

Our company has a dedicated team of data analysts who meet these requirements. They carried out a healthcare project on the formation of regulatory reporting, and within the framework of that project also implemented forecasting of the workload of medical organizations and their segmentation by statistical indicators. The value of such forecasts for the customer is clear: for him it is not just the use of some new exotic technology, but a completely natural expansion of analytical capabilities. As a result, interest in developing the system grows, and for us it means new work. We are now applying predictive analytics in a similar way in a city management project.

And finally, we have experience implementing big data technologies where unstructured data is involved, primarily various text documents. The Internet, with its huge volumes of unstructured information, offers great opportunities for business. We had a very interesting experience developing a real estate appraisal system, ROSEKO, commissioned by the Russian Society of Appraisers. To select analogous objects, the system collected data from sources on the Internet, processed this information using linguistic technologies and enriched it with geo-analytics using machine learning methods.

CNews: What solutions of its own is Force developing in the areas of business intelligence and big data?

Olga Gorchinskaya:

We have developed, and continue to develop, a special big data solution - ForSMedia. It is a platform for analyzing social media data in order to enrich knowledge about customers. It can be used in various industries - the financial sector, telecom, retail - wherever companies want to know as much as possible about their customers.



A typical use case is the development of targeted marketing campaigns. If a company has 20 million customers, sending every advertisement to the entire base is unrealistic. It is necessary to narrow the circle of ad recipients, the goal being to increase customer response to the marketing offer. In this case, we can upload basic data about all customers (names, surnames, dates of birth, places of residence) into ForSMedia, and then, based on information from social networks, add new useful information to them: circle of interests, social status, family composition, field of professional activity, musical preferences, and so on. Of course, such knowledge cannot be obtained for all clients, since some of them do not use social networks at all, but for targeted marketing even such an "incomplete" result gives huge advantages.
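The enrichment step described above can be sketched very roughly in code: match a customer record to a social profile and merge in the extra attributes. The matching key, the field names and the data below are assumptions for illustration only; ForSMedia's real pipeline (identity resolution, linguistic processing, etc.) is far more sophisticated.

```python
# Hypothetical sketch of customer-record enrichment from social data.
# Field names, the (name, birth date) matching key and the sample
# records are illustrative assumptions, not ForSMedia's actual schema.

customers = [
    {"name": "Anna Petrova", "born": "1990-04-12", "city": "Moscow"},
    {"name": "Ivan Sidorov", "born": "1985-07-03", "city": "Kazan"},
]
profiles = [
    {"name": "Anna Petrova", "born": "1990-04-12",
     "interests": ["music", "travel"], "family": "married"},
]

def enrich(customers, profiles):
    """Attach social-profile attributes to matching customer records."""
    index = {(p["name"], p["born"]): p for p in profiles}
    enriched = []
    for c in customers:
        extra = index.get((c["name"], c["born"]), {})
        # keep the customer's own fields; add only new attributes
        merged = {**c, **{k: v for k, v in extra.items() if k not in c}}
        enriched.append(merged)
    return enriched

print(enrich(customers, profiles))
```

Records without a matching profile simply pass through unchanged, which mirrors the "incomplete but still useful" result the text describes.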

Social networks are a very rich source, although difficult to work with. It is not so easy to identify a person among the users: people often use different forms of their names and do not indicate their age or preferences, and it is not easy to infer a user's characteristics from their posts and group subscriptions.

The ForSMedia platform solves all these problems on the basis of big data technologies, enriching customer data and analyzing the results on a massive scale. The technologies used include Hadoop, the R statistical environment, RCO linguistic processing tools and Data Discovery tools.

The ForSMedia platform makes extensive use of freely distributed software and can be installed on any hardware platform that meets the requirements of the business task. But for large deployments with increased performance requirements, we offer a special version optimized to run on Oracle engineered systems - Oracle Big Data Appliance and Oracle Exalytics.

The use of innovative integrated Oracle systems in large projects is an important area of our activity, and not only in the field of analytical systems. Such projects are not cheap, but given the scale of the tasks being solved, they fully justify themselves.

CNews: Can customers test these systems somehow before making a purchasing decision? Do you provide, for example, test benches?

Olga Gorchinskaya:

In this direction we do not just provide test benches: we have created the largest platform of its kind in Europe, unique in Russia - the Fors Solution Center. Its main task is to bring the latest Oracle technologies closer to the end customer, to help partners master and apply them, and to make hardware and software testing as accessible as possible. The idea did not arise out of nowhere: for almost 25 years Force has been developing and implementing solutions based on Oracle technologies and platforms. We have extensive experience working with both clients and partners. In fact, Force is the Oracle competence center in Russia.

Building on this experience, in 2011, when the first versions of the Oracle Exadata database machine appeared, we created the first laboratory for mastering these systems and called it ExaStudio. There, dozens of companies could discover the possibilities of the new Exadata software and hardware solutions. Finally, in 2014 we turned it into a kind of data center for testing systems and cloud solutions - the Fors Solution Center.

Now our Center presents the full line of the latest Oracle hardware and software systems - from Exadata and Exalogic to the Big Data Appliance - which in effect act as test benches for our partners and customers. In addition to testing, the Center offers services for auditing information systems, migrating to a new platform, and setting up, configuring and scaling.

The Center is actively developing in the direction of cloud technologies. Not long ago its architecture was reworked so that its computing resources and services can be provided in the cloud. Customers can now use self-service capabilities to upload test data and applications and run their tests in the cloud.

As a result, a partner company or a customer can, without preliminary investment in equipment or pilot projects on their own premises, upload their applications to our cloud, test them, compare performance results and make a decision on moving to a new platform.

CNews: And the last question - what will you present at Oracle Day?

Olga Gorchinskaya:

Oracle Day is the corporation's main event of the year in Russia for it and all its partners. Force has repeatedly been its general sponsor, and is again this year. The forum will be entirely devoted to cloud topics - PaaS, SaaS, IaaS - and will be held as Oracle Cloud Day, since Oracle pays great attention to these technologies.

At the event, we will present our ForSMedia platform, as well as talk about the experience of using big data technologies and projects in the field of business intelligence. And, of course, we will tell you about the new capabilities of our Fors Solution Center in the field of building cloud solutions.

Small businesses in the CIS countries do not yet use data analysis for business development, finding correlations or searching for hidden patterns: entrepreneurs get by with the reports of marketers and accountants. Leaders of small and medium-sized businesses rely more on intuition than on analysis. Yet analytics has huge potential: it helps reduce costs and increase profits, make decisions faster and more objectively, optimize processes, understand customers better and improve the product.

An accountant is not a substitute for an analyst

Small business leaders often assume that the reports of marketers and accountants adequately reflect the company's activities. But it is very difficult to make a decision on the basis of dry statistics, and without specialized training errors in calculation are inevitable.

Case 1. Post-analysis of promotional campaigns. For the New Year, an entrepreneur announced a promotion offering certain goods at a discount. After assessing the revenue for the New Year period, he saw sales increase and was delighted with his resourcefulness. But let's take all the factors into account:

  • Sales grow especially strongly on Friday, the day when revenue is highest - this is a weekly trend.
  • Compared with the sales growth that usually occurs before the New Year, the gain is not so great.
  • If promotional items are filtered out, it turns out that sales figures actually deteriorated.
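The reasoning in this case can be sketched in code: compare promo-period sales of the discounted items against a baseline built from comparable non-promo days (same weekdays, same seasonal run-up), rather than against raw total revenue. The function and figures below are illustrative, not from a real campaign.

```python
# Sketch: was the New Year promotion really effective?
# Compare promo-item sales during the promotion against a baseline of
# the same items on comparable non-promo days, so the Friday spike and
# the usual pre-New-Year growth are not mistaken for a promo effect.
# All numbers are hypothetical.

from statistics import mean

def promo_uplift(promo_sales, baseline_sales):
    """Relative uplift of promo-period sales over the seasonal baseline."""
    base = mean(baseline_sales)
    return (mean(promo_sales) - base) / base

# Daily revenue of promo items during the promotion...
promo = [118, 122, 119, 125]
# ...and on the same weekdays in comparable pre-New-Year weeks:
baseline = [120, 125, 121, 126]

print(f"uplift: {promo_uplift(promo, baseline):+.1%}")
```

A negative uplift here would mean exactly what the bullets describe: against a properly adjusted baseline, the promotional items actually sold worse.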

Case 2. Research of turnover. A women's clothing store has difficulties with logistics: in some warehouses the goods are in short supply, while in others they have been lying for months. How can you determine, without analyzing sales, how many trousers to bring to one region and how many coats to send to another, while getting the maximum profit? To do this you need to calculate turnover: the ratio of the speed of sales to the average inventory for a certain period. Put simply, turnover is an indicator of how many days the store will take to sell a product, how quickly the average stock is sold, and how quickly the product pays for itself. Storing large reserves is economically unprofitable, as it freezes capital and slows development. But if the stock is reduced too far, there may be shortages and the company will again lose profit. Where is the golden mean, the ratio at which the product does not stagnate in the warehouse, yet the customer can be reasonably sure of finding the desired item in the store? To find it, the analyst must help you determine:

  • desired turnover,
  • turnover dynamics.

When settling with suppliers on a deferred basis, it is also necessary to calculate the ratio of the credit line to turnover. Turnover in days = average inventory * number of days / sales for the period.
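The formula above can be wrapped in a small helper; the sample figures are made up for illustration:

```python
# Turnover in days = average inventory * number of days / sales for the
# period. A thin wrapper around the formula from the text; the sample
# figures below are invented for illustration.

def turnover_in_days(avg_inventory, period_days, period_sales):
    """How many days the average stock takes to sell through."""
    return avg_inventory * period_days / period_sales

# A store held on average 300 coats over a 30-day period and sold 90:
days = turnover_in_days(avg_inventory=300, period_days=30, period_sales=90)
print(round(days, 1))  # 100.0 -> coats sit in stock for over three months
```

A high value like this signals frozen capital; comparing it across stores and categories shows where stock should be moved or marked down.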

Calculating the remaining assortment and the total turnover by store helps you understand where part of the product should be moved. It is also worth calculating the turnover rate for each unit of the assortment, in order to decide on a markdown when demand falls, an additional order when demand rises, or a move to a different warehouse. By category, you can build a turnover report in this form: it shows that T-shirts and jumpers sell quickly, while coats take a long time. Will an ordinary accountant be able to do this kind of work? We doubt it. Meanwhile, regularly calculating turnover and applying the results can increase profits by 8-10%.

In what areas is data analysis applicable?

  1. Sales. It is important to understand why sales are going well (or badly) and what the dynamics are. To solve this problem, you need to research the factors that influence profit and revenue - for example, analyze the average check and the revenue per customer. Such factors can be investigated by product group, season or store. Sales peaks and troughs can be identified by analyzing returns, cancellations and other transactions.
  2. Finance. Monitoring indicators is necessary for any financier to track cash flow and allocate assets across the various areas of the business. It also helps to assess the efficiency of taxation and other parameters.
  3. Marketing. Any marketing campaign needs forecasts and post-promotion analysis. At the stage of developing an idea, you need to define the control and target groups of goods for which the offer is created. This, too, is a job for a data analyst, since an ordinary marketer does not have the tools and skills for a proper analysis. For example, if the total revenue and the number of buyers in the control group are the same as in the target group, the promotion did not work; interval analysis is needed to determine this.
  4. Management. Leadership alone is not enough for the head of a company. Quantitative assessment of staff performance is in any case necessary for competent management of the enterprise. It is important to understand the efficiency of payroll spending, the ratio of salaries to sales, and the efficiency of processes - for example, the workload of cash registers or the employment of loaders during the day. This helps to manage working hours properly.
  5. Web analysis. For a site to become a sales channel it must be properly promoted, and this requires the right promotion strategy. This is where web analysis comes in: study the behavior, age, gender and other characteristics of customers, activity on certain pages, clicks, traffic channels, the effectiveness of mailings, and so on. This will help improve both the business and the website.
  6. Assortment management. ABC analysis is essential for assortment management. The analyst distributes products according to their characteristics in order to understand which product is the most profitable, which forms the core of sales, and which is worth getting rid of. To understand the stability of sales, it is also useful to conduct an XYZ analysis.
  7. Logistics. Studying logistics indicators gives a better understanding of procurement, goods, their storage and availability. Losses, demand for goods and inventory levels are also important to understand for successful business management.
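As an illustration of point 6, a minimal ABC classification by cumulative revenue share might look like this (the 80%/95% thresholds are the conventional ones; the sales data is made up):

```python
# Sketch of ABC analysis: rank SKUs by revenue and classify them by
# cumulative revenue share (A = top ~80%, B = next ~15%, C = the rest).
# Thresholds follow the common convention; sample data is invented.

def abc_classify(revenue_by_sku):
    """Return an A/B/C class for each SKU by cumulative revenue share."""
    total = sum(revenue_by_sku.values())
    classes, cumulative = {}, 0.0
    for sku, rev in sorted(revenue_by_sku.items(), key=lambda kv: -kv[1]):
        cumulative += rev / total
        classes[sku] = ("A" if cumulative <= 0.80
                        else "B" if cumulative <= 0.95
                        else "C")
    return classes

sales = {"t-shirts": 500, "jumpers": 300, "trousers": 120,
         "coats": 50, "scarves": 30}
print(abc_classify(sales))
```

Class A items form the core of the assortment, class B deserves monitoring, and class C is the candidate list for markdown or removal; an XYZ analysis of demand stability would complement this view.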

These examples show how powerful data analysis can be, even for small businesses. By using data analytics correctly, an experienced CEO can increase the company's bottom line and benefit from even the smallest insights, while visual reports greatly simplify the manager's job.