Science and Technology

Big Data, Cloud Computing and Edge Computing

Computer Science: Networking, Telecom, AI/ML, Big Data, Cloud/Edge Computing, IoT, Blockchain, Digital Currency, VR/AR, OTT, Social Media

Paper II · Unit 2 · Section 5 of 12

4.1 Big Data

Big Data refers to datasets that are too large, fast-changing, or complex for traditional data-processing software to handle. It is commonly characterised by the 5 Vs:

| V | Description | Example |
|---|---|---|
| Volume | Massive scale (terabytes → exabytes) | Facebook: 4 petabytes of data/day |
| Velocity | Speed of data generation and processing | Twitter: 500 million tweets/day |
| Variety | Multiple formats (text, images, video, sensor data) | Healthcare: EHR + imaging + genomics |
| Veracity | Accuracy and trustworthiness of data | Social media noise, fake reviews |
| Value | Business/societal insight extracted | Fraud detection, personalised medicine |
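
To get a feel for the Volume and Velocity figures in the table, the daily totals can be converted into average per-second rates. The following is a minimal sketch; the helper name `per_second` is illustrative, and it assumes the decimal convention of 1 petabyte = 10^15 bytes and a perfectly even load across the day.

```python
SECONDS_PER_DAY = 24 * 60 * 60  # 86,400 seconds


def per_second(daily_total: float) -> float:
    """Convert a per-day total into an average per-second rate."""
    return daily_total / SECONDS_PER_DAY


# Velocity example from the table: 500 million tweets per day.
tweets_per_second = per_second(500_000_000)

# Volume example from the table: 4 PB per day.
# Assumption: 1 PB = 10**15 bytes, 1 GB = 10**9 bytes (decimal units).
gigabytes_per_second = per_second(4 * 10**15) / 10**9

print(f"{tweets_per_second:,.0f} tweets/s")   # 5,787 tweets/s
print(f"{gigabytes_per_second:.1f} GB/s")     # 46.3 GB/s
```

Real traffic is bursty, so peak rates would be well above these averages.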

Big Data Technologies:

  • Hadoop (2006): Open-source framework for distributed storage (HDFS) and batch processing (MapReduce); runs jobs in parallel across clusters of up to thousands of servers
  • Apache Spark: In-memory data processing engine; up to 100× faster than Hadoop MapReduce on some in-memory workloads (the project's own benchmark claim); supports near-real-time stream processing
  • NoSQL Databases: Non-relational; suited to semi-structured and unstructured data. Examples: MongoDB (document store), Cassandra (wide-column store), Redis (key-value store), Neo4j (graph database)
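
The MapReduce model behind Hadoop can be illustrated with the classic word-count example. The sketch below runs on a single machine in plain Python and does not use the Hadoop API; the function names (`map_phase`, `shuffle`, `reduce_phase`, `word_count`) are chosen for this illustration only. On a real cluster, each chunk would be mapped on a different node and the shuffle would move data across the network.

```python
from collections import defaultdict


def map_phase(chunk: str):
    """Map: emit a (word, 1) pair for every word in one input chunk."""
    return [(word.lower(), 1) for word in chunk.split()]


def shuffle(pairs):
    """Shuffle: group all emitted values by their key (the word)."""
    groups = defaultdict(list)
    for key, value in pairs:
        groups[key].append(value)
    return groups


def reduce_phase(groups):
    """Reduce: sum the counts collected for each word."""
    return {word: sum(counts) for word, counts in groups.items()}


def word_count(chunks):
    pairs = []
    for chunk in chunks:  # sequential here; parallel across nodes in a cluster
        pairs.extend(map_phase(chunk))
    return reduce_phase(shuffle(pairs))


chunks = ["big data big insight", "data at the edge"]
counts = word_count(chunks)
print(counts["big"], counts["data"], counts["edge"])  # 2 2 1
```

Because each map call depends only on its own chunk, the map phase can be spread over many servers without coordination, which is what makes the model scale.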

Applications:

  • Healthcare: Analysing EHR data + genomics + imaging → precision medicine (IBM Watson Health)
  • Finance: Fraud detection across millions of daily transactions
  • Government: UIDAI processes 50 million+ Aadhaar verifications daily using big data infrastructure
  • Agriculture: Analysis of satellite imagery + weather data + soil sensors → precision farming

4.2 Cloud Computing

Cloud computing delivers computing resources (servers, storage, databases, networking, software, analytics) over the Internet with on-demand availability, scalability, and pay-per-use pricing.

Deployment Models:

| Model | Description | Example |
|---|---|---|
| Public cloud | Resources owned/managed by third-party provider | AWS, Microsoft Azure, Google Cloud |
| Private cloud | Dedicated cloud for single organisation | NIC Cloud (MeghRaj, India's government cloud) |
| Hybrid cloud | Combination of public and private | Most large enterprises |
| Multi-cloud | Using multiple cloud providers | Using AWS + Azure + GCP together |

Service Models:

| Model | What Customer Controls | What Provider Controls | Examples |
|---|---|---|---|
| IaaS (Infrastructure) | OS, middleware, runtime, data, applications | Hardware, virtualisation, network | AWS EC2, Azure VMs, Google Compute Engine |
| PaaS (Platform) | Data, applications | OS, runtime, middleware, hardware | Google App Engine, Heroku, Azure App Service |
| SaaS (Software) | User data and configuration only | Everything else | Gmail, Salesforce, Office 365, Zoom |
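
The split of responsibilities in the service-model table can be encoded as a small lookup. This is only a sketch for revision purposes: the layer names and the `who_manages` helper are illustrative, and placing "runtime" on the customer side under IaaS is an assumption that follows the usual convention. Real providers document the exact boundary per service.

```python
# Stack layers, bottom to top (simplified for this sketch).
LAYERS = [
    "hardware", "virtualisation", "network",
    "os", "middleware", "runtime", "data", "applications",
]

# Layers the customer manages under each service model.
# Assumption: under IaaS the runtime belongs to the customer,
# since it sits on top of the customer-managed OS.
CUSTOMER_CONTROLS = {
    "IaaS": {"os", "middleware", "runtime", "data", "applications"},
    "PaaS": {"data", "applications"},
    "SaaS": {"data"},  # user data and configuration only
}


def who_manages(model: str, layer: str) -> str:
    """Return 'customer' or 'provider' for a layer under a service model."""
    if model not in CUSTOMER_CONTROLS:
        raise ValueError(f"unknown service model: {model}")
    if layer not in LAYERS:
        raise ValueError(f"unknown layer: {layer}")
    return "customer" if layer in CUSTOMER_CONTROLS[model] else "provider"


print(who_manages("IaaS", "os"))            # customer
print(who_manages("PaaS", "os"))            # provider
print(who_manages("SaaS", "applications"))  # provider
```

Reading the sets from IaaS to SaaS shows the pattern: the customer's share shrinks as the provider takes over more of the stack.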

India's GovCloud — MeghRaj

  • National Cloud of India, operated by NIC (National Informatics Centre) under the GI Cloud initiative; launched 2014
  • Hosts government applications: GSTN (GST portal), Aadhaar systems, government e-mail
  • DigiSakshee — cloud-based election management infrastructure

4.3 Edge Computing

Edge computing processes data at or near the source (edge of the network) rather than sending all data to a central cloud data centre.

Why edge computing?

  • Latency: A cloud round trip typically takes 50–200 ms, whereas edge processing can respond in under 5 ms; this is critical for autonomous vehicles, industrial robots, AR/VR, and remote surgery
  • Bandwidth: Only relevant or pre-processed data is sent to the cloud, reducing transfer costs
  • Privacy: Sensitive data (patient health records, factory-floor telemetry) is processed locally rather than sent to the cloud
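
The latency and bandwidth arguments above can be made concrete with a little arithmetic. The sketch below is illustrative only: the vehicle speed of 100 km/h, the sensor readings, the threshold, and the function names are all assumptions chosen for the example, and the latencies are the upper cloud figure (200 ms) and the edge figure (5 ms) quoted above.

```python
def distance_travelled_m(speed_kmh: float, latency_ms: float) -> float:
    """Metres a vehicle covers while waiting latency_ms for a response."""
    speed_m_per_s = speed_kmh * 1000 / 3600
    return speed_m_per_s * latency_ms / 1000


# Latency: how far does a car at 100 km/h move before a decision arrives?
cloud_gap = distance_travelled_m(100, 200)  # cloud round trip, 200 ms
edge_gap = distance_travelled_m(100, 5)     # edge response, 5 ms
print(f"cloud: {cloud_gap:.2f} m, edge: {edge_gap:.2f} m")
# cloud: 5.56 m, edge: 0.14 m


def edge_filter(readings, threshold):
    """Bandwidth: forward only readings above the threshold to the cloud."""
    return [r for r in readings if r > threshold]


# Hypothetical temperature readings from one sensor (degrees Celsius).
readings = [21.0, 21.2, 20.9, 35.4, 21.1, 21.0, 36.0, 20.8]
alerts = edge_filter(readings, threshold=30.0)
print(f"forwarded {len(alerts)} of {len(readings)} readings")
# forwarded 2 of 8 readings
```

In this toy case the edge node sends a quarter of the raw data upstream, and the car moves about 14 cm rather than more than 5 m before a response arrives.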

Edge vs. Fog vs. Cloud:

  • Cloud: Centralised and powerful, but with higher latency
  • Fog: Intermediate layer of nodes between edge devices and the cloud (a term introduced by Cisco)
  • Edge: Processing at the device or local server level