Lexical Analysis in Compiler Design for GATE 2026 CS&IT: A Complete Guide to Scoring Marks in the Exam

By Satyajit

Jan 3, 2026

If you are preparing seriously for GATE 2026 CS & IT, Lexical Analysis is a topic you cannot afford to ignore. Compiler Design contributes direct or indirect questions to the exam every year, and Lexical Analysis is among its most scoring parts.

In this article you will learn how Lexical Analysis works, how tokens are formed, how regular expressions and finite automata are used in practice, and how to handle the tricky concepts that GATE asks about repeatedly. After reading this guide, you should be able to attempt Lexical Analysis questions with confidence.

What is Lexical Analysis in Compiler Design?

Lexical Analysis is the first phase of a compiler. It breaks the source program into small meaningful units called tokens.

The lexical analyzer reads the source code character by character, discards unnecessary elements such as spaces, tabs, and comments, and recognizes keywords, identifiers, operators, literals, and delimiters, converting them into tokens.

In simple terms, Lexical Analysis acts as a bridge between the source code and the rest of the compiler, so that the later phases can work with the program easily.

Overview of Phases in Compiler Design

A compiler converts the source program into machine code step by step. The whole job is done in 6 main phases.

Phase	Work / Role
Lexical Analysis	Characters → Tokens
Syntax Analysis	Grammar checking
Semantic Analysis	Meaning checking
Intermediate Code Generation	Middle-level code
Code Optimization	Improve performance
Code Generation	Machine code output

Together these phases make up the compiler, and GATE can ask questions from any of them, so do not skip any phase.

Role of Lexical Analysis in Compiler Design

It is the first phase of the compiler.

Reads the source code character by character
Recognizes keywords, identifiers, operators, and constants and builds tokens from them
Removes spaces, tabs, and comments
Enters identifiers into the symbol table
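
The cleanup step in the list above can be sketched in a few lines. This is a minimal illustration, not how a production lexer does it: the helper name strip_comments_and_whitespace and the comment pattern are assumptions made for this example, and the sketch assumes comment markers never appear inside string literals.

```python
import re

# Hypothetical pattern: /* ... */ block comments and // line comments.
# Assumption: comment markers never occur inside string literals.
COMMENT_PATTERN = re.compile(r'/\*.*?\*/|//[^\n]*', re.DOTALL)

def strip_comments_and_whitespace(source):
    """Replace each comment with a space, then collapse whitespace runs."""
    without_comments = COMMENT_PATTERN.sub(' ', source)
    return ' '.join(without_comments.split())

sample = """
int a;   /* counter */
a = 10;  // initial value
"""
print(strip_comments_and_whitespace(sample))
# prints: int a; a = 10;
```

A real scanner usually skips comments while it reads characters rather than in a separate pass; the two-step form here is only easier to read.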

Lexical Analysis in Compiler Design – Step by Step Working

When the compiler reads the source program, the lexical analyzer:

Reads the input character by character
Ignores unnecessary elements such as spaces, newlines, and comments
Recognizes keywords, identifiers, constants, and operators
Converts each valid part into a token

Example Source Code:

#include <stdio.h>

int main() {
    int a, b, sum;

    printf("Enter first number: ");
    scanf("%d", &a);

    printf("Enter second number: ");
    scanf("%d", &b);

    sum = a + b;

    printf("Sum of two numbers = %d\n", sum);

    return 0;
}

The lexical analyzer converts this program into tokens. For example, the keyword int and the statement sum = a + b; produce:

Token Type	Lexeme
Keyword	int
Identifier	sum
Operator	=
Identifier	a
Operator	+
Identifier	b
Delimiter	;
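
A table like this can be produced by a small regex-driven scanner. The sketch below is an illustration under stated assumptions: TOKEN_SPEC, MASTER_PATTERN, and tokenize are names invented for this example, and the patterns cover only a tiny C-like subset (no string literals, no multi-character operators).

```python
import re

# Assumed, simplified token patterns for a small C-like subset.
# Order matters: Keyword is tried before Identifier.
TOKEN_SPEC = [
    ('Keyword',    r'\b(?:int|float|return|if)\b'),
    ('Identifier', r'[A-Za-z_][A-Za-z0-9_]*'),
    ('Constant',   r'\d+'),
    ('Operator',   r'[=+\-*/]'),
    ('Delimiter',  r'[;,(){}]'),
    ('Skip',       r'\s+'),
    ('Mismatch',   r'.'),
]
MASTER_PATTERN = re.compile(
    '|'.join(f'(?P<{name}>{pattern})' for name, pattern in TOKEN_SPEC)
)

def tokenize(source):
    """Return a list of (token type, lexeme) pairs."""
    tokens = []
    for match in MASTER_PATTERN.finditer(source):
        kind, lexeme = match.lastgroup, match.group()
        if kind == 'Skip':
            continue  # whitespace is dropped, not turned into a token
        if kind == 'Mismatch':
            raise ValueError(f'Unexpected character: {lexeme!r}')
        tokens.append((kind, lexeme))
    return tokens

for kind, lexeme in tokenize('sum = a + b;'):
    print(f'{kind}\t{lexeme}')
```

Each named group in the combined pattern corresponds to one token type, and match.lastgroup reports which alternative matched.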

What Are Token, Lexeme, and Pattern?

Term	Meaning
Token	Category or class (Identifier, Keyword, etc.)
Lexeme	Actual text from the program
Pattern	Rule used to identify the token

Example:
sum = a + b;

Token → Identifier
Lexeme → sum
Pattern → [a-zA-Z][a-zA-Z0-9]*
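
To make the relationship concrete, the short sketch below checks several lexemes against the single identifier pattern from the example. The helper name matches_identifier is made up for this illustration; the point is that many different lexemes map to the same token because they all fit one pattern.

```python
import re

# The identifier pattern from the example above.
IDENTIFIER_PATTERN = re.compile(r'[a-zA-Z][a-zA-Z0-9]*')

def matches_identifier(lexeme):
    # fullmatch succeeds only if the entire lexeme fits the pattern
    return IDENTIFIER_PATTERN.fullmatch(lexeme) is not None

for lexeme in ['sum', 'a', 'b', 'number1', '2sum']:
    token = 'Identifier' if matches_identifier(lexeme) else 'no match'
    print(f'{lexeme!r:>10} -> {token}')
```

Here sum, a, b, and number1 are four lexemes of the one token Identifier, while 2sum does not fit the pattern.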

Token Categories in Lexical Analysis

1. Keywords

Fixed reserved words
Example: int, float, return, if

2. Identifiers

Variable or function names
Example: main, total, number1

3. Operators

+ , - , * , / , =

4. Literals

Constant values
Example: 10, 3.14, "Hello"

5. Punctuations / Delimiters

; , ( ) { }
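
The five categories can be turned into a small classifier. This is a sketch under assumptions: the function name classify, the keyword and symbol sets, and the literal pattern (integers, simple decimals, and double-quoted strings without escapes) are chosen for illustration only.

```python
import re

# Assumed, deliberately small sets taken from the examples above.
KEYWORDS = {'int', 'float', 'return', 'if'}
OPERATORS = {'+', '-', '*', '/', '='}
DELIMITERS = {';', ',', '(', ')', '{', '}'}
# Integers, simple decimals, and double-quoted strings without escapes.
LITERAL_PATTERN = re.compile(r'\d+(?:\.\d+)?|"[^"\n]*"')
IDENTIFIER_PATTERN = re.compile(r'[a-zA-Z_][a-zA-Z0-9_]*')

def classify(lexeme):
    """Return the token category of a single lexeme."""
    # Keywords are checked first because they also fit the identifier pattern.
    if lexeme in KEYWORDS:
        return 'Keyword'
    if lexeme in OPERATORS:
        return 'Operator'
    if lexeme in DELIMITERS:
        return 'Delimiter'
    if LITERAL_PATTERN.fullmatch(lexeme):
        return 'Literal'
    if IDENTIFIER_PATTERN.fullmatch(lexeme):
        return 'Identifier'
    return 'Unknown'

for lexeme in ['return', 'total', '3.14', '"Hello"', '+', '{']:
    print(lexeme, '->', classify(lexeme))
```

The ordering of the checks is the design choice worth noticing: a reserved word such as return would otherwise be classified as an identifier.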

Regular Expressions & Finite Automata

Lexical Analysis uses Regular Expressions and Finite Automata (FA) to recognize tokens.

Example Regex:

Identifier → [a-zA-Z][a-zA-Z0-9]*

Finite Automata:

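DFA – Deterministic Finite Automaton
NFA – Non-deterministic Finite Automaton

A lexer is usually implemented as a DFA, because a DFA is deterministic and fast: it makes exactly one transition per input character and never backtracks over alternatives. Regular expressions are typically converted to an NFA first and then to an equivalent DFA.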
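
As an illustration, the identifier regex [a-zA-Z][a-zA-Z0-9]* corresponds to a two-state DFA, which can be written as a transition table. The state names, the char_class helper, and dfa_accepts are assumptions made for this sketch, not part of any standard tool.

```python
def char_class(ch):
    """Map a character to the input class used by the transition table."""
    # isascii() keeps the sketch aligned with the ASCII-only regex.
    if ch.isascii() and ch.isalpha():
        return 'letter'
    if ch.isascii() and ch.isdigit():
        return 'digit'
    return 'other'

# States: 'start' (nothing read yet) and 'in_id' (accepting).
# A missing entry means the DFA moves to a dead state and rejects.
TRANSITIONS = {
    ('start', 'letter'): 'in_id',
    ('in_id', 'letter'): 'in_id',
    ('in_id', 'digit'): 'in_id',
}
ACCEPTING = {'in_id'}

def dfa_accepts(text):
    """Run the DFA over text, one transition per character."""
    state = 'start'
    for ch in text:
        state = TRANSITIONS.get((state, char_class(ch)))
        if state is None:
            return False
    return state in ACCEPTING

for word in ['sum', 'number1', '2sum']:
    print(word, '->', dfa_accepts(word))
```

The loop reads each character exactly once, which is why a DFA-based scanner runs in time linear in the input length.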

Working of Lexical Analyzer (Simple Flow)

Source Code → Scanner (Lexical Analyzer) → Tokens → Parser

While producing tokens, the scanner records identifiers in the Symbol Table.

What Is a Symbol Table?

The symbol table stores information about identifiers, such as:

Variable name
Type
Scope
Memory address
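
A symbol table holding these four fields can be sketched with a dictionary. Everything named here (SymbolEntry, SymbolTable, insert, lookup) is hypothetical, and the addresses are simple running offsets that assume a 4-byte int; real compilers assign storage differently.

```python
from dataclasses import dataclass

@dataclass
class SymbolEntry:
    name: str
    type: str
    scope: str
    address: int  # illustrative offset, not a real machine address

class SymbolTable:
    def __init__(self):
        self._entries = {}      # keyed by (scope, name)
        self._next_address = 0  # next free offset

    def insert(self, name, type_name, scope, size):
        key = (scope, name)
        if key in self._entries:
            raise KeyError(f'{name} already declared in scope {scope}')
        entry = SymbolEntry(name, type_name, scope, self._next_address)
        self._entries[key] = entry
        self._next_address += size
        return entry

    def lookup(self, name, scope):
        """Return the entry, or None if the name is not declared in scope."""
        return self._entries.get((scope, name))

table = SymbolTable()
for name in ['a', 'b', 'sum']:
    table.insert(name, 'int', 'main', size=4)  # assumes a 4-byte int
print(table.lookup('sum', 'main'))
# prints: SymbolEntry(name='sum', type='int', scope='main', address=8)
```

Keying the dictionary by (scope, name) lets the same variable name exist in two different scopes without a clash.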

Lexical Errors & Handling

Lexical errors occur when:

An invalid character is found
An identifier is malformed
A string is unterminated

Example:

import re

# Identifiers start with a letter or underscore, followed by letters, digits, or underscores.
IDENTIFIER_PATTERN = re.compile(r'[a-zA-Z_][a-zA-Z0-9_]*')
KEYWORDS = {'int', 'float', 'double', 'char'}

def is_valid_identifier(identifier):
    # fullmatch requires the whole string to fit the pattern
    return IDENTIFIER_PATTERN.fullmatch(identifier) is not None

def lexical_analyzer(statement):
    # Simplification: drop semicolons and split on whitespace
    # instead of scanning character by character
    tokens = statement.replace(';', '').split()

    for token in tokens:
        if token.isdigit():
            print(f"Constant detected: {token}")
        elif token in KEYWORDS:
            print(f"Keyword detected: {token}")
        elif is_valid_identifier(token):
            print(f"Valid Identifier detected: {token}")
        else:
            print(f"Lexical Error! Invalid identifier: {token}")

# Example with lexical error
code = "int 2sum;"
print("Analyzing code:", code)
lexical_analyzer(code)

Output:

Analyzing code: int 2sum;
Keyword detected: int
Lexical Error! Invalid identifier: 2sum

2sum is invalid because an identifier cannot start with a digit.
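
The other two error kinds from the list, invalid characters and unterminated strings, can be detected in a similar spirit. The sketch below is a rough illustration: find_lexical_errors and ALLOWED_SYMBOLS are names invented here, the symbol set is partial, and escape sequences inside strings are not handled.

```python
# Assumed, partial set of symbols allowed outside string literals.
ALLOWED_SYMBOLS = set("+-*/=<>!&|%;,(){}[]#._' \t")

def find_lexical_errors(line):
    """Return (column, message) pairs for one line of C-like source."""
    errors = []
    i = 0
    while i < len(line):
        ch = line[i]
        if ch == '"':
            # Simplification: escaped quotes inside strings are not handled.
            closing = line.find('"', i + 1)
            if closing == -1:
                errors.append((i, 'Unterminated string'))
                break
            i = closing + 1  # skip the whole string literal
            continue
        if not (ch.isascii() and (ch.isalnum() or ch in ALLOWED_SYMBOLS)):
            errors.append((i, f'Invalid character {ch!r}'))
        i += 1
    return errors

print(find_lexical_errors('int a = 5 @ 3;'))
print(find_lexical_errors('printf("Hello);'))
```

The first call reports the stray @ at column 10, and the second reports the string that opens at column 7 and never closes.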

Difference Between Lexical Analysis and Syntax Analysis

Lexical Analysis	Syntax Analysis
Builds tokens from characters	Checks tokens against the grammar
First Phase	Second Phase
Example: recognizes int as a keyword	Example: checks whether int a=10; is well-formed
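
The difference shows up when a statement is made of valid tokens in the wrong order. In the sketch below, token_types plays the lexical role and is_declaration plays the syntax role; both are hypothetical helpers, and the grammar is reduced to the single rule Keyword Identifier = Constant ; for illustration.

```python
import re

def token_types(source):
    """Lexical step: turn text into a list of token types (mini-lexer)."""
    types = []
    # Simplification: characters outside these patterns are silently skipped.
    for lexeme in re.findall(r'[A-Za-z_]\w*|\d+|[=;]', source):
        if lexeme == 'int':
            types.append('Keyword')
        elif lexeme[0].isdigit():
            types.append('Constant')
        elif lexeme in ('=', ';'):
            types.append(lexeme)
        else:
            types.append('Identifier')
    return types

def is_declaration(types):
    """Syntax step: accept only  Keyword Identifier = Constant ;"""
    return types == ['Keyword', 'Identifier', '=', 'Constant', ';']

good = token_types('int a=10;')
bad = token_types('int = a 10;')
print(good, is_declaration(good))
print(bad, is_declaration(bad))
```

Both inputs tokenize without any lexical error, but only the first one passes the grammar check, which is the job of the syntax analyzer.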

Lexical Analysis Previously Asked Questions in GATE Exam

Q1. In a compiler, keywords of a language are recognized during which phase?

Ans. Lexical Analysis

Q2. Which data structure in a compiler is used for managing information about variables and their attributes?

Ans. Symbol Table

Q3. Match the following:

Set X: P. Lexical Analyzer, Q. Syntax Analyzer, R. Intermediate Code Generator, S. Code Optimizer

Set Y: 1. Abstract Syntax Tree, 2. Token, 3. Parse Tree, 4. Constant Folding

Ans.

P → 2 (Token)

Q → 3 (Parse Tree)

R → 1 (Abstract Syntax Tree)

S → 4 (Constant Folding)

Q4. Which of the following is NOT a function of the lexical analyzer?

(A) Removing comments and white spaces

(B) Generating tokens

(C) Parsing expressions

(D) Error reporting for illegal identifiers

Ans. (C) Parsing expressions (belongs to syntax analysis)

Q5. The output of a lexical analyzer is:

Ans. A stream of tokens.

Q6. Which tool is commonly used to implement lexical analysis?

Ans. Lex/Flex.

Q7. The lexical analyzer uses which type of machine for token recognition?

Ans. Finite automata (FA)

Q8. Which of the following errors are detected by a lexical analyzer?

(A) Spelling errors in keywords

(B) Missing semicolons

(D) Undeclared variables

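Ans. (A) Spelling errors in keywords. Missing semicolons are syntax errors and undeclared variables are semantic errors. Note that a misspelled keyword such as fi for if is often a well-formed identifier, so a lexer may pass it through as an Identifier token and leave the error to a later phase.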

Q9. The role of a lexical analyzer includes:

(A) Removing comments

(B) Removing white spaces

(D) All of the above

Ans. (D) All of the above.

Q10. Which of the following pairs is correctly matched?

(A) Lexical Analysis → Regular Expressions

(B) Syntax Analysis → Pushdown Automata

(D) Dataflow Analysis → Optimization

Ans. (A), (B), (C), (D) all are correct matches

Preparation Tips for GATE Aspirants

Solve 10 MCQs every day
Practice DFA and regex problems
Analyse previous years' questions
Be sure to study error-based questions

Conclusion

Lexical Analysis is a topic that is both simple and scoring. If you clearly understand tokens, regex, DFA, and lexical errors, you are well placed to score good marks in the Compiler Design section of GATE 2026.

If you prepare this chapter sincerely, your confidence will grow as well.

Satyajit

I am Satyajit, the founder of this website and the author of its blog posts. I hold a bachelor's degree in Information Technology (BSc-IT) and a master's degree in Computer Application (MCA).
