Showing posts from November, 2018

How to Use Python for Data Analysis?

If you are here for data analysis using Python, you probably already know why Python is a great language for developers. So, in short, let me remind you of some well-known strengths of Python without going deep into “What is Python” and so on. Python is open source and free to use, has a great collection of libraries, supports both structured and object-oriented programming, and offers great readability compared with many other languages and, of course, great community support.

Well, you agree that Python is great, and now you need to know how to use it for data analysis. The process involves several steps, so let’s get going.
Understanding the Type of Data: The foremost step is to identify the type of data available for analysis. Assume we have a huge amount of data in Excel sheets, with millions of rows and columns. Can you derive value from this data using the basic search and find commands in Excel? Maybe you can, but it is going to be messy and time-consumi…
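
To make the "identify the type of data" step concrete, here is a minimal sketch using only Python's standard library. It assumes the spreadsheet has been exported to CSV. The sample data, the column names and the helper functions `infer_type` and `column_types` are hypothetical and only for illustration.

```python
import csv
import io

# Hypothetical sample standing in for a large spreadsheet exported as CSV.
SAMPLE = """order_id,amount,city
1,19.99,Pune
2,5.50,Delhi
3,12.00,Pune
"""

def infer_type(values):
    """Guess a column type from its string values: int, float, or text."""
    for caster, name in ((int, "int"), (float, "float")):
        try:
            for value in values:
                caster(value)  # raises ValueError if the value does not fit
            return name
        except ValueError:
            continue
    return "text"

def column_types(csv_text):
    """Return {column name: inferred type} for CSV content given as a string."""
    rows = list(csv.DictReader(io.StringIO(csv_text)))
    return {col: infer_type([row[col] for row in rows]) for col in rows[0]}

types = column_types(SAMPLE)
print(types)
```

Knowing which columns are numeric and which are text tells you what kind of analysis each one can support, for example sums and averages on numbers, or grouping and counting on text.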

Is it Possible to use MySQL for Big Data Analysis?

MySQL is a popular relational database management system (RDBMS) for web applications (think of Twitter) and many other applications. The question here is how far MySQL can stretch: is it possible to use MySQL, in one way or another, for Big Data analysis? Even if you think ‘yes’, the problem lies in the loose definition of Big Data, which creates a haze of confusion for new learners. MySQL is good for data mining and online analytical processing (OLAP). So, does that mean we can scale to and analyze Big Data using MySQL? The answer is not definitive; there are two possibilities, so let’s look at them.

Big Data is generated continuously and in large volumes, so it needs something bigger, like Hadoop, for storage, management and batch processing. However, we can connect MySQL to Hadoop and import and export relational data using Apache Sqoop. HDFS is used for storing the data, and for analysis the data can be passed on to MySQL. For instance, raw metrics can be stored in HDFS, while summarized data can be sent to My…
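
The split between raw and summarized data can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the in-memory list stands in for raw metrics kept in HDFS, the standard-library `sqlite3` module stands in for MySQL, and the table name `daily_summary` and the helper `summarize` are hypothetical. In a real setup, a tool such as Apache Sqoop would move the data between Hadoop and MySQL.

```python
import sqlite3
from collections import defaultdict

# Hypothetical raw metrics, standing in for records stored in HDFS.
raw_metrics = [
    ("2018-11-01", "page_view", 3),
    ("2018-11-01", "page_view", 5),
    ("2018-11-02", "page_view", 2),
    ("2018-11-02", "signup", 1),
]

def summarize(records):
    """Collapse raw records into one total per (day, metric) pair."""
    totals = defaultdict(int)
    for day, metric, value in records:
        totals[(day, metric)] += value
    return [(day, metric, total) for (day, metric), total in sorted(totals.items())]

# sqlite3 is used here only as a stand-in for a MySQL connection.
conn = sqlite3.connect(":memory:")
conn.execute("CREATE TABLE daily_summary (day TEXT, metric TEXT, total INTEGER)")
conn.executemany("INSERT INTO daily_summary VALUES (?, ?, ?)", summarize(raw_metrics))
conn.commit()

# The relational side now answers analytical questions over the small summary table.
rows = conn.execute(
    "SELECT metric, SUM(total) FROM daily_summary GROUP BY metric ORDER BY metric"
).fetchall()
print(rows)
```

The point of the design is that the relational database only ever holds the compact summary, while the bulky raw records stay in the distributed store.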

Big Data Overview

What Topics in Python Should You Learn for Data Analysis?

First off, understand that there is a difference between developing full-fledged software and doing data analysis with Python. Your aim here is data analysis, so learning Python becomes imperative for you. Right? Well, most people new to ‘big data’ and ‘data science’ rush in pell-mell, because they do not know what is really worth learning. They think that learning Python from A to Z will make them smarter. Maybe it will, but it takes far too much time. As a newcomer, you should be able to work out exactly what you need to learn to do data analysis using Python.

In this post, we will go through the most likely path to becoming confident in Python and, subsequently, in data analysis.

Step 1 - Basics:
Your learning process starts with rudimentary knowledge. Resources for learning Python in general differ from resources aimed at a specific goal. Either way, you must learn the basics of Python. To learn…