Thursday, 11 June 2015

What is BIG Data?




BIG Data refers to very large collections of data sets, measured in Terabytes, Petabytes, or even larger units.

  • 1 Bit            = 1 Binary Digit
  • 8 Bits           = 1 Byte
  • 1024 Bytes       = 1 Kilobyte
  • 1024 Kilobytes   = 1 Megabyte
  • 1024 Megabytes   = 1 Gigabyte
  • 1024 Gigabytes   = 1 Terabyte
  • 1024 Terabytes   = 1 Petabyte
  • 1024 Petabytes   = 1 Exabyte
  • 1024 Exabytes    = 1 Zettabyte
  • 1024 Zettabytes  = 1 Yottabyte
  • 1024 Yottabytes  = 1 Brontobyte (informal name, not a standardized unit)
  • 1024 Brontobytes = 1 Geopbyte (informal name, not a standardized unit)
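
The conversions in the list above can be sketched in a few lines of Python. This is only an illustration: the function names `to_bytes` and `human_readable` are made up for this example, and the sketch stops at Yottabytes because the larger names are informal.

```python
# Units in ascending order; each one is 1024 times the previous one.
UNITS = ["Bytes", "Kilobytes", "Megabytes", "Gigabytes", "Terabytes",
         "Petabytes", "Exabytes", "Zettabytes", "Yottabytes"]

def to_bytes(value, unit):
    """Convert a value in the given unit to bytes, using 1024 as the multiplier."""
    return value * 1024 ** UNITS.index(unit)

def human_readable(num_bytes):
    """Express a byte count in the largest unit that keeps the value at 1 or above."""
    value = float(num_bytes)
    for unit in UNITS:
        # Stop when the value fits in this unit, or when no larger unit is left.
        if value < 1024 or unit == UNITS[-1]:
            return f"{value:.2f} {unit}"
        value /= 1024

print(to_bytes(1, "Terabytes"))       # 1099511627776
print(human_readable(5 * 1024 ** 5))  # 5.00 Petabytes
```

So one Terabyte is a little over a trillion bytes, which gives a feel for the sizes the rest of this post talks about.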

BIG Data also refers to complex data sets that cannot be processed by traditional data processing applications. Complex data here means completely unstructured data whose format we cannot predict in advance, such as image, audio and video files.

So BIG Data is a huge volume of data (more than Terabytes in size) that can be in structured, semi-structured or unstructured formats.

  • If we are able to predict the data format, we call it Structured data. Example: data from RDBMS packages, ERP, CRM, etc.
  • If we are only partially able to predict the data format, we call it Semi-structured data. Example: XML files.
  • If we are not able to predict the data format, we call it Unstructured data. Example: image, audio and video files.
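
To make the three categories a little more concrete, here is a small Python sketch using only the standard library. All of the sample data (names, cities, phone number) is invented for illustration.

```python
import csv
import io
import xml.etree.ElementTree as ET

# Structured: every row follows the same known columns, like a database table.
table = io.StringIO("id,name,city\n1,Asha,Pune\n2,Ravi,Delhi\n")
rows = list(csv.DictReader(table))

# Semi-structured: tags describe the data, but the fields can differ per record.
xml_text = (
    "<customers>"
    "<customer id='1'><name>Asha</name></customer>"
    "<customer id='2'><name>Ravi</name><phone>555-0100</phone></customer>"
    "</customers>"
)
root = ET.fromstring(xml_text)
fields_per_record = [[child.tag for child in customer] for customer in root]

# Unstructured: raw bytes with no schema we can query directly.
# These eight bytes are the signature that opens a PNG image file.
blob = bytes([0x89, 0x50, 0x4E, 0x47, 0x0D, 0x0A, 0x1A, 0x0A])

print(rows[0]["city"])     # Pune
print(fields_per_record)   # [['name'], ['name', 'phone']]
print(len(blob), "bytes of raw image data")
```

The CSV rows can be read by column name because the format is known up front, the XML records have to be inspected one by one because their fields vary, and the image bytes carry no field names at all.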


Some real-world examples of BIG Data are:

  • The stock market generates around one Terabyte of new trade data every day.
  • The telecommunication industry generates more than 100 million CDRs (Call Detail Records).
  • Twitter generates thousands of tweets every second.