Django is a high-level Python web framework designed to build secure and scalable web applications quickly. It follows the Model-View-Template (MVT) architectural pattern and includes built-in tools for authentication, database interactions, and URL routing.
Models in Django represent database tables as Python classes. Each attribute in a model corresponds to a field in the table. Example:
```python
from django.db import models

class Book(models.Model):
    title = models.CharField(max_length=100)
```
The Object-Relational Mapper (ORM) in Django allows interaction with the database using Python code instead of SQL queries. Example:
```python
Book.objects.filter(title='Django')
```
The settings.py file contains project configurations like database setup, middleware, installed apps, and static files configuration.
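A minimal sketch of a settings.py excerpt (the values shown are illustrative, not a complete file):

```python
# settings.py (illustrative excerpt)
from pathlib import Path

BASE_DIR = Path(__file__).resolve().parent.parent

INSTALLED_APPS = [
    'django.contrib.admin',
    'django.contrib.auth',
    'books',  # hypothetical project app
]

DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.sqlite3',
        'NAME': BASE_DIR / 'db.sqlite3',
    }
}

STATIC_URL = 'static/'
```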
An app is a modular component of a Django project that performs a specific function, such as managing users or blog posts. Use python manage.py startapp appname to create an app.
Migrations are Django's way of propagating model changes to the database schema. The main commands are:
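```bash
python manage.py makemigrations   # generate migration files from model changes
python manage.py migrate          # apply pending migrations to the database
```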
manage.py is a command-line tool for performing administrative tasks like starting the server, applying migrations, and creating superusers.
Middleware is a layer between the request and response cycle. It processes requests before the view and responses before sending them back to the client.
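A minimal sketch of a custom middleware class, following Django's callable-middleware pattern (the logging behavior is illustrative); it would be registered in the MIDDLEWARE setting:

```python
import logging

logger = logging.getLogger(__name__)

class RequestLoggingMiddleware:
    def __init__(self, get_response):
        self.get_response = get_response  # called once at server startup

    def __call__(self, request):
        logger.info('Request: %s', request.path)  # runs before the view
        response = self.get_response(request)     # the view executes here
        return response                           # runs after the view
```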
The login_required decorator restricts a view to authenticated users. Example:

```python
from django.contrib.auth.decorators import login_required
from django.shortcuts import render

@login_required
def dashboard(request):
    return render(request, 'dashboard.html')
```
render() combines a template with a context dictionary and returns an HttpResponse, while redirect() returns an HttpResponseRedirect that sends the client to another URL or named view.
Signals allow decoupled components to communicate. Example: pre_save, post_save.
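A minimal sketch of a post_save receiver (the handler logic is illustrative):

```python
from django.contrib.auth.models import User
from django.db.models.signals import post_save
from django.dispatch import receiver

@receiver(post_save, sender=User)
def user_saved(sender, instance, created, **kwargs):
    # `created` is True only on the first save (INSERT).
    if created:
        print(f'New user created: {instance.username}')
```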
django.contrib.auth is Django's built-in authentication system for managing users, permissions, and groups.
Static files (CSS, JS, images) are managed using the STATICFILES_DIRS and STATIC_URL settings.
A QuerySet represents a lazy collection of objects retrieved from the database. Example:
```python
Book.objects.all()
```
Django supports caching to optimize performance. Supported backends include Memcached, Redis, and local memory caching.
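A minimal sketch using the low-level cache API (the key and timeout are arbitrary):

```python
from django.core.cache import cache

cache.set('book_count', 42, timeout=300)  # cache for five minutes
count = cache.get('book_count')           # None if missing or expired
```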
Class-based views organize views into classes rather than functions, offering better code reuse.
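A minimal sketch of a class-based view (names are illustrative):

```python
from django.http import HttpResponse
from django.views import View

class HelloView(View):
    def get(self, request):
        # Each HTTP method maps to a method on the class.
        return HttpResponse('Hello from a class-based view')
```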
Django forms handle user input and validation. Example:

```python
from django import forms

class BookForm(forms.Form):
    title = forms.CharField(max_length=100)
```
Serializers (in Django REST Framework) convert complex data types, such as model instances, into native Python datatypes that can be rendered as JSON. Example:
```python
from rest_framework import serializers
from .models import Book  # assumes a Book model in this app

class BookSerializer(serializers.ModelSerializer):
    class Meta:
        model = Book
        fields = '__all__'  # ModelSerializer requires fields or exclude
```
Migrations make schema changes to the database using Python code instead of raw SQL.
ForeignKey defines a many-to-one relationship between two models.
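A minimal sketch (the models are illustrative):

```python
from django.db import models

class Author(models.Model):
    name = models.CharField(max_length=100)

class Book(models.Model):
    # Many-to-one: each book has one author; an author can have many books.
    author = models.ForeignKey(Author, on_delete=models.CASCADE)
```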
URLConf maps URLs to view functions or classes.
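A minimal urls.py sketch (the views and route names are hypothetical):

```python
from django.urls import path
from . import views  # assumes book_list and book_detail views exist

urlpatterns = [
    path('books/', views.book_list, name='book-list'),
    path('books/<int:pk>/', views.book_detail, name='book-detail'),
]
```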
Run:

```bash
python manage.py createsuperuser
```
reverse() generates URLs dynamically based on view names.
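For example, using the hypothetical 'book-detail' route from the URLConf sketch above:

```python
from django.urls import reverse

url = reverse('book-detail', args=[1])  # e.g. '/books/1/'
```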
get_object_or_404() fetches an object from the database and raises an Http404 exception if the object doesn’t exist. Example:
```python
from django.shortcuts import get_object_or_404

book = get_object_or_404(Book, id=1)
```
Generic views are pre-built views for common tasks like displaying lists or detail pages. Example: ListView, DetailView.
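A minimal ListView sketch (the model and template name are illustrative):

```python
from django.views.generic import ListView
from .models import Book  # assumes a Book model in this app

class BookListView(ListView):
    model = Book                      # queryset defaults to Book.objects.all()
    template_name = 'book_list.html'  # receives 'object_list' in its context
```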
The Paginator class divides querysets into smaller pages. Example:
```python
from django.core.paginator import Paginator

p = Paginator(Book.objects.all(), 10)
```
Fixtures are serialized data used to populate the database. Example: JSON or XML files loaded via loaddata.
The AUTH_USER_MODEL setting in settings.py defines the custom user model for a Django project.
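A minimal sketch (the app and class names are hypothetical):

```python
# accounts/models.py
from django.contrib.auth.models import AbstractUser

class User(AbstractUser):
    pass  # add custom fields here as needed

# settings.py
AUTH_USER_MODEL = 'accounts.User'
```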
```bash
python manage.py flush
python manage.py makemigrations
python manage.py migrate
```
A SlugField creates SEO-friendly URLs by converting text into URL-safe slugs. Example:
```python
slug = models.SlugField(unique=True)
```
ManyToManyField defines a many-to-many relationship between two models. Example:
```python
authors = models.ManyToManyField(Author)
```
Context processors pass additional context to templates globally. Example: request, user, or custom variables.
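A minimal sketch of a custom context processor (the module and variable are hypothetical); it would be listed under context_processors in the TEMPLATES setting:

```python
# myapp/context_processors.py
def site_info(request):
    # The returned dict is merged into every template's context.
    return {'site_name': 'My Bookstore'}
```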
By defining ModelAdmin subclasses in admin.py and setting options like list_display and search_fields.
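A minimal sketch (the Book model is assumed):

```python
from django.contrib import admin
from .models import Book

@admin.register(Book)
class BookAdmin(admin.ModelAdmin):
    list_display = ('title',)   # columns shown in the change list
    search_fields = ('title',)  # enables the admin search box
```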
JsonResponse returns JSON data from a Django view. Example:
```python
from django.http import JsonResponse

def hello(request):
    return JsonResponse({'message': 'Hello, World'})
```
STATIC_ROOT is the directory where static files are collected using collectstatic.
MEDIA_ROOT is the directory where user-uploaded files are stored.
Caching improves performance by temporarily storing data. Supported backends include Redis and Memcached.
Template filters transform variables in templates. Example: {{ name|upper }} converts text to uppercase.
Database connections are configured via the DATABASES setting in settings.py.
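A sketch of a PostgreSQL configuration (the credentials are placeholders):

```python
DATABASES = {
    'default': {
        'ENGINE': 'django.db.backends.postgresql',
        'NAME': 'mydb',
        'USER': 'myuser',
        'PASSWORD': 'secret',   # placeholder credentials
        'HOST': 'localhost',
        'PORT': '5432',
    }
}
```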
ALLOWED_HOSTS specifies the valid domain names or IP addresses for the site, preventing HTTP Host header attacks.
Sessions store user-specific data on the server side.
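A minimal sketch of session access inside a view (the key and value are arbitrary):

```python
from django.http import JsonResponse

def set_theme(request):
    # Session data lives server-side; the client holds only a session ID cookie.
    request.session['theme'] = 'dark'
    return JsonResponse({'theme': request.session['theme']})
```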
Django handles file uploads using FileField and MEDIA_ROOT.
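A minimal sketch (the model and upload path are illustrative):

```python
from django.db import models

class Document(models.Model):
    # Uploaded files are saved under MEDIA_ROOT/uploads/.
    file = models.FileField(upload_to='uploads/')
```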
TemplateView is a class-based view for rendering templates. Example:
```python
from django.views.generic import TemplateView

class HomeView(TemplateView):
    template_name = 'home.html'
```
Signals are used for event-driven programming between decoupled components.
The csrf_exempt decorator disables CSRF protection for specific views. Example:
```python
from django.views.decorators.csrf import csrf_exempt

@csrf_exempt
def my_view(request):
    pass
```
The @property decorator allows a method to be accessed like an attribute.
In Django REST Framework, views handle HTTP requests and responses. Examples: APIView, ViewSet.
Mixins are reusable view logic components.
The @api_view decorator converts a function-based view into a DRF API view.
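A minimal sketch (the endpoint and payload are hypothetical):

```python
from rest_framework.decorators import api_view
from rest_framework.response import Response

@api_view(['GET'])
def book_count(request):
    return Response({'count': 42})  # rendered as JSON by default
```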
WSGI is the Python Web Server Gateway Interface, acting as an interface between web servers and Django.
ASGI is the Asynchronous Server Gateway Interface for handling async requests.
SQLite is the default database in Django.
The collectstatic command gathers static files from all apps into a single directory (STATIC_ROOT).
The MapReduce framework is the engine that processes the data in Hadoop. It consists of two main phases: the Map phase, which transforms input records into intermediate key-value pairs, and the Reduce phase, which aggregates those pairs into the final output.
The Hadoop Distributed Cache is a mechanism that distributes files (such as configuration files, libraries, or archives) across the cluster for use by MapReduce tasks. It allows tasks to access these files locally without requiring network calls.
A block in HDFS is the smallest unit of storage in the system. The default block size in HDFS is 128 MB, but it can be configured based on the needs of the application.
The Hadoop client is an interface through which users interact with the Hadoop cluster. It allows for submitting jobs, querying data in HDFS, and performing other administrative tasks.
HDFS provides fault tolerance by replicating data blocks across multiple nodes. If a DataNode fails, the data can still be accessed from its replicas on other nodes. The default replication factor is 3.
The purpose of HDFS is to store large datasets across multiple nodes in a Hadoop cluster. It ensures high availability, reliability, and scalability of data storage.
The NodeManager is responsible for managing the individual nodes in the cluster. It monitors resource usage, enforces resource limits, and reports node health to the ResourceManager.
In Hadoop 1.x, the JobTracker was responsible for managing the scheduling and execution of MapReduce jobs. It handled job execution, task tracking, and failure management.
The ResourceManager is responsible for managing the resources of the entire cluster. It schedules jobs, allocates resources, and ensures that applications get the resources they need to run efficiently.
The Hadoop Streaming API allows developers to write MapReduce programs in languages like Python, Ruby, or Perl. It lets users use custom mappers and reducers without needing to write Java code.
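A minimal word-count sketch in Python, assuming the scripts are passed to the streaming jar via its -mapper and -reducer options (the reducer relies on Hadoop sorting its input by key):

```python
#!/usr/bin/env python
# mapper.py: emit a (word, 1) pair for every word read from stdin
import sys

for line in sys.stdin:
    for word in line.split():
        print(f'{word}\t1')
```

```python
#!/usr/bin/env python
# reducer.py: sum the counts for each word (input arrives sorted by key)
import sys

current_word, count = None, 0
for line in sys.stdin:
    word, value = line.rstrip('\n').split('\t')
    if word == current_word:
        count += int(value)
    else:
        if current_word is not None:
            print(f'{current_word}\t{count}')
        current_word, count = word, int(value)

if current_word is not None:
    print(f'{current_word}\t{count}')
```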
MapReduce processes data by dividing it into two phases: the Map phase, which processes input splits in parallel and emits intermediate key-value pairs, and the Reduce phase, which aggregates the values for each key into the final result.
HBase is a NoSQL database that is built on top of Hadoop and provides real-time read/write access to large datasets. It is designed for applications requiring random, low-latency access to data.
The Hadoop ecosystem consists of various tools and frameworks that complement the core Hadoop system. These include Hive (SQL-based querying), Pig (data flow language), HBase (NoSQL database), and Oozie (workflow scheduler).
A sequence file is a binary format that stores data in the form of key-value pairs. It is often used to store data for MapReduce jobs because it is more efficient than plain text files.
The Hadoop Common module contains the Java libraries and utilities required by other modules in the Hadoop ecosystem. It provides essential functions for data serialization, file system management, and job execution.
A combiner is an optional optimization in Hadoop that performs partial aggregation of the data in the Mapper before it is sent to the Reducer. This helps reduce the amount of data transferred between the Map and Reduce tasks.
A job is submitted to Hadoop via the Hadoop client. The client uses the JobClient class to submit the job, and the job is executed on the cluster using the available resources managed by the ResourceManager.
Zookeeper is a distributed coordination service used to maintain configuration information, provide synchronization, and manage the naming of services in distributed systems. It is crucial for systems like HBase and Kafka.
Data replication in HDFS ensures high availability and fault tolerance. If a DataNode fails, the system can retrieve the data from other replicas stored across different nodes, minimizing data loss.
The default replication factor in HDFS is 3, meaning each data block is replicated three times across different DataNodes in the cluster.
A TaskTracker is a daemon in Hadoop 1.x responsible for executing the tasks assigned by the JobTracker. It monitors the task's progress and reports back to the JobTracker.