About Big Data Analytics


About Data & Analytics

In this digital era, more and more things and activities are recorded as data. Data volume is growing exponentially and data variety keeps increasing. However, most data is unstructured raw data that, on its own, has little value for business operations and strategies.

Big data analytics means analyzing large volumes of data, in various forms and from various sources, and turning them into valuable information, strategies, and actions. Big data usually means data measured in gigabytes or terabytes. Although calculation principles are the same regardless of data size, the working process for large data is very different from the one for small data. Big data analytics requires sophisticated computation algorithms that process data step by step toward business solutions. To support the entire process, high-speed computers (servers or cloud computing), large data storage, advanced analytics software and languages (such as SQL, Python, SAS, or R), sophisticated computer programs, and efficient data transfer are necessary.

Big data analytics involves complicated and lengthy data analysis work, such as data processing, transformation, integration, measurement creation, aggregation, prediction, relationship discovery, pattern detection, strategy development, and action fulfillment. In business data analytics, understanding what the data means and building business rules and objectives into the design of analytical processes, measurements, and models are critical to generating sensible results and successful outcomes. Measurements and decision rules in the computing process can be very different from project to project and from company to company. For example, measurements for product management can differ from those for customer loyalty management. Data aggregations and presentations also mirror organization structures, product hierarchies, geography, time frames, and business goals. Furthermore, advanced analysis, such as predictive modeling, pattern recognition, strategy development, and decision making, is expanding into more and more business areas.
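As a minimal sketch of how one set of records can be aggregated along different business dimensions, the Python example below totals sales by geography alone and then by geography and time frame together. The records, the field names (state, quarter, amount), and the aggregate helper are hypothetical and used only for illustration.

```python
from collections import defaultdict

# Hypothetical sales records; values are made up for illustration.
sales = [
    {"state": "California", "quarter": "Q3", "amount": 120.0},
    {"state": "California", "quarter": "Q3", "amount": 80.0},
    {"state": "California", "quarter": "Q4", "amount": 50.0},
    {"state": "Texas", "quarter": "Q3", "amount": 70.0},
]


def aggregate(rows, keys):
    """Sum the 'amount' field for each distinct combination of the given keys."""
    totals = defaultdict(float)
    for row in rows:
        group = tuple(row[k] for k in keys)
        totals[group] += row["amount"]
    return dict(totals)


# The same data summarized at two levels of detail.
by_state = aggregate(sales, ["state"])
by_state_quarter = aggregate(sales, ["state", "quarter"])

print(by_state)
print(by_state_quarter)
```

Changing the list of keys is enough to produce a different view, which is one simple way an aggregation can follow a product hierarchy, a region, or a reporting period.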

Data is growing, and so is big data analytics. Well-planned, well-designed, and accurate analysis can help companies see facts and insights, develop strategies, and take timely actions for business success.

Types of Data for Analysis and Computer Programming

Data is a general term that covers various types. More than a dozen data types are commonly used in data analytics and computer programming.

Numerical data is a basic data type in data analysis and computer programming. It includes integers and decimals, that is, numbers such as 1, 3, 39, 2163, 2.76, and 268.63. It is the most commonly used data type in retail, finance, medicine, engineering, and the sciences. Numerical data can be calculated with mathematical formulas and operations (e.g. +, -, x). Another commonly used data type is the string (character), such as a name, address, or description. String data cannot simply be calculated, but it can be very important as an indicator or description in analysis. Other commonly used data types include boolean (e.g. True or False), date (e.g. January 26, 2019), time (e.g. 9:26am), binary (e.g. 1 or 0), and file. These data types are also very useful in programming for data analysis and application design. As information technology evolves, more forms of data have been recognized, such as machine data and image data. Conveniently, data type definitions are almost the same whether in statistical programming (in SAS and Python) or in general computer programming (in C and Java).
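The data types described above can be illustrated with a short Python sketch. The record, its field names, and its values are hypothetical; they only show how each type looks in code and which operations make sense for it.

```python
from datetime import date, time

# Hypothetical sample record; field names and values are illustrative only.
record = {
    "units_sold": 39,                  # integer
    "unit_price": 2.76,                # decimal (float)
    "product_name": "Water bottle",    # string (character)
    "in_stock": True,                  # boolean
    "sale_date": date(2019, 1, 26),    # date
    "sale_time": time(9, 26),          # time
    "raw_payload": b"\x01\x00",        # binary
}

# Numerical fields support arithmetic directly.
revenue = record["units_sold"] * record["unit_price"]

# Strings are not calculated, but they serve well as labels or descriptions.
label = f"{record['product_name']} on {record['sale_date'].isoformat()}"

# Show the Python type behind each field.
for field, value in record.items():
    print(f"{field}: {type(value).__name__}")
print(round(revenue, 2))
print(label)
```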

In real life, most data is unstructured. For data to work in programming languages, it has to be extracted, cleaned, and transformed. For example, a simple sales description, 'Total sales of SunShine Plastic water bottles in 3rd quarter in stores in California is $36,2356', can be stored in a database in different ways. Stored in a single column, it cannot be used for analysis. From an analytical point of view, this piece of information contains at least 8 descriptive data items (Brand: SunShine; Product Type: Plastic Bottle; Time Period: 3rd quarter; Sales Channel: in store; Geography: California; Measurement: total sales; Measure Unit: US$; Sales Amount: $36,2356). Stored in a database for analysis, it can take 8 columns (fields).
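A rough Python sketch of this extraction is shown below. The regular expression is a hypothetical pattern written to fit only this one sentence layout; real descriptions vary and would need more robust parsing. The field names are illustrative, and the amount is kept as text exactly as written rather than converted to a number.

```python
import re

text = (
    "Total sales of SunShine Plastic water bottles in 3rd quarter "
    "in stores in California is $36,2356"
)

# Hypothetical pattern: one named group per descriptive field.
pattern = re.compile(
    r"^(?P<measurement>Total sales) of (?P<brand>\S+) (?P<product_type>.+?)"
    r" in (?P<time_period>.+?) in (?P<sales_channel>stores)"
    r" in (?P<geography>.+?) is (?P<unit>\$)(?P<amount>[\d,]+)$"
)

match = pattern.match(text)
if match is None:
    raise ValueError("description does not follow the expected layout")

# Each named group becomes one column (field) of a structured record.
fields = match.groupdict()

for name, value in fields.items():
    print(f"{name}: {value}")
```

Once the sentence is split this way, each field can be filtered, grouped, or joined on its own, which is not possible while the whole description sits in one text column.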

In data analysis and computer programming, information extraction, cleaning, transformation, and integration are important and basic steps in making data work for businesses.

Data Preparation Basics

Data collection, cleaning, extraction, restructuring, and transformation are the initial and important steps in any data analysis. Market data comes from various sources and in various forms. Some data has an intuitive meaning and can be used in market analysis directly, such as sales, price, brand, and customer data. Other data may not have an intuitive meaning (such as weather, special events, and inflation rate) but can be very helpful in finding the causes of market performance, sales changes, and customer shopping patterns. Market analysis is based on data, and the entire analytical process depends on data passing logically through algorithmic computation programs. Although data collection and integration are time consuming, they are very important in making sure analyses are meaningful and valuable to businesses.
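As a small illustration of the cleaning and transformation steps, the Python sketch below standardizes a few raw sales rows. The rows, the field names, and the rule of dropping rows with missing units are assumptions made for this example, not a prescribed method.

```python
# Hypothetical raw rows as they might arrive from different sources.
raw_rows = [
    {"brand": " SunShine ", "price": "$2.50", "units": "12"},
    {"brand": "sunshine", "price": "2.75", "units": ""},  # units missing
    {"brand": "SUNSHINE", "price": "$3.00", "units": "8"},
]


def clean_row(row):
    """Return a cleaned copy of the row, or None if a required value is missing."""
    if not row["units"].strip():
        return None
    return {
        # Trim whitespace and use one consistent capitalization.
        "brand": row["brand"].strip().title(),
        # Remove the currency symbol and convert text to a number.
        "price": float(row["price"].replace("$", "").strip()),
        "units": int(row["units"]),
    }


cleaned = [c for c in (clean_row(r) for r in raw_rows) if c is not None]

# Cleaned numeric fields can now be used in calculations.
total_sales = sum(r["price"] * r["units"] for r in cleaned)

print(cleaned)
print(total_sales)
```

Whether incomplete rows should be dropped, corrected, or filled in depends on the project; the point here is only that such rules are applied before any measurement is computed.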

There are many types of data from various data sources. Data can be numerical or categorical, and numerical data can be continuous numbers or integers. Traditional market data includes sales, price, product hierarchy and characteristics, competition, promotion, season, demographics, and geography. With the rapid evolution of information technology, more and more data is available for market analysis, such as shopping basket data, online product search data, online order and product return data, coupon redemption data, and advertising view and click data. Since each market analysis project aims at answering market questions, solving specific issues, or finding better ways to reach marketing goals, the data used may not always be the same, and analytical methods and market metrics should be designed for the needs of the specific project. With finding market issues and developing solutions in mind, data analysis should start with selecting and verifying data, integrating data into useful and meaningful forms, planning practical analysis methods, and defining the project scope. Meaningful and well-structured data is the foundation for digging out the causes of market performance, understanding consumer insights, and developing achievable marketing strategies.

Data collection and integration are detailed and time-consuming work. We can help turn massive raw data into meaningful and useful information for market analysis.

Data Aggregation for Business Analysis

Data Visualization

Text Data Analysis