Generators in Python

Generators

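In simple terms, a generator is a function that returns an iterator: instead of computing all of its results at once, it produces them one at a time as the caller asks for them.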

There are some key differences between a generator function and a regular function.
  • A generator function contains at least one yield statement. A regular function contains none.
  • Once the generator yields, control returns to the caller immediately and the generator is paused until the caller requests the next value. In a regular function, once the 'return' statement runs, the function is finished and control returns to the caller.
  • A generator remembers its local variables (and their values) between calls (iterations) until it terminates.
  • Generators are memory efficient: values are produced one at a time instead of being built up in a list first. They are not always faster overall, but the first value is available without waiting for the rest to be computed.
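
The pause-and-resume behaviour in the list above can be seen in a small sketch. The `counter` generator and its `limit` parameter are illustrative names made up for this example.

```python
def counter(limit):
    # 'count' is a local variable; its value survives between yields
    count = 0
    while count < limit:
        count += 1
        yield count  # execution pauses here until the next value is requested


gen = counter(3)
first = next(gen)   # runs the body up to the first yield
second = next(gen)  # resumes after the yield with 'count' still set to 1
third = next(gen)
print(first, second, third)
```

Each call to `next` continues from where the previous one stopped, so the counter keeps increasing rather than starting again from zero.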

Let's have a look at a simple generator that contains multiple yield statements and see how it differs from a regular function.

def sample_generator():
    # Simple generator with multiple yield statements
    yield "This"
    yield "is"
    yield "a"
    yield "generator"


# Iterate through generator in for loop
for text in sample_generator():
    print(text)


Result

This
is
a
generator


In the above example,
  • No arguments are passed to the generator function.
  • Four yield statements are used.
    • Each time a yield statement is executed, control is passed back to the caller. In this case the generator is used in a for loop.
    • The generator raises 'StopIteration' when it is completed. The for loop handles this automatically and stops iterating.
The next value from the generator can also be fetched with the 'next' function. Each call has to be coded explicitly, once per value to be fetched, or placed inside a loop.
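
When the number of values is not known in advance, one option is to give 'next' a default as its second argument, so that the default is returned at the end instead of an exception being raised. The sketch below uses a shortened copy of the generator and assumes `None` is never a legitimate yielded value.

```python
def sample_generator():
    yield "This"
    yield "is"


gen = sample_generator()
words = []
while True:
    # The second argument is returned once the generator is exhausted
    word = next(gen, None)
    if word is None:
        break
    words.append(word)

print(words)
```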

gen = sample_generator()
print(next(gen))
print(next(gen))
print(next(gen))
print(next(gen))
print(next(gen))
print("Generator execution is completed")



Result

This
is
a
generator
Traceback (most recent call last):
  File "/Users/Admin/PycharmProjects/pythonProject/main.py", line 25, in <module>
    print(next(gen))
StopIteration



In the above example,
  • We call the 'next' function five times, passing the generator object as the argument.
  • The generator contains only four yield statements, so when 'next' is called the fifth time, a 'StopIteration' exception is raised and none of the following statements are run.
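
One way to avoid the traceback is to catch 'StopIteration' explicitly, which is roughly what a for loop does on our behalf. The following is a minimal sketch of that idea; the `collected` list is only there to hold the results.

```python
def sample_generator():
    yield "This"
    yield "is"
    yield "a"
    yield "generator"


gen = sample_generator()
collected = []
while True:
    try:
        collected.append(next(gen))
    except StopIteration:
        # The generator is exhausted; leave the loop instead of crashing
        break

print(collected)
print("Generator execution is completed")
```

With the exception handled, the final print statement is reached.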

The above generator is a very simple one with multiple yield statements. Let's have a look at another example that returns the square of each number within a range up to (but not including) the integer passed.

Regular Function

Let's have a look at the regular function first.

def square_function(number):
    # Square list
    square_list = []

    # Append square of each number in the range
    for num in range(1, number):
        square_list.append(num ** 2)

    # Return a list with squares
    return square_list


# Store the result to a list
sq_list = square_function(10)

# Iterate through square function
for square in square_function(10):
    print(square)



Result

1
4
9
16
25
36
49
64
81


In the above example,
  • Lines 1 - 10: Function 'square_function' returns a list containing the square of each number from 1 up to (but not including) the integer passed.
    • Line - 3: Create an empty list.
    • Line - 6: Iterate through the range using a for loop.
    • Line - 7: Append the square of the number to the list.
    • Line - 10: Return the list.
  • Line - 14: When square_function is called, a list is returned.
  • Line - 17: square_function can be called directly in a for loop. As the function returns a list, the for loop iterates through that list.
    • When the function is called, all the values are added to the list and returned at once.

Generator Function

Let's now have a look at a generator that does the same.

def square_generator(number):

    # Yield square of each number
    for num in range(1, number):
        yield num ** 2


# A generator object is created
sq_gen = square_generator(10)

# Iterate through square generator
for square in square_generator(10):
    print(square)


Result

1
4
9
16
25
36
49
64
81


In the above example,
  • Lines 1 - 5: Generator function 'square_generator' returns a generator that yields the squares when iterated.
    • Line - 4: For loop to iterate through the range of numbers.
    • Line - 5: Yield the square of each number in the for loop.
      • Every time the yield statement runs, control is passed back to the caller.
      • When the caller requests the next value, the generator resumes from the statement after the yield.
  • Line - 9: Calling the generator function creates a generator object; none of the function body has run yet. The object can either be iterated or converted to a list with the list() function.
  • Line - 12: A generator can be used directly in a loop to iterate through all the elements.
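
Converting with list() is worth a quick sketch, because it also shows that a generator object can only be consumed once. The variable names `first_pass`, `second_pass` and `fresh_pass` are illustrative.

```python
def square_generator(number):
    # Yield square of each number
    for num in range(1, number):
        yield num ** 2


sq_gen = square_generator(5)
first_pass = list(sq_gen)    # consumes every value the generator yields
second_pass = list(sq_gen)   # the same object is now exhausted
fresh_pass = list(square_generator(5))  # a new call gives a new generator

print(first_pass)
print(second_pass)
print(fresh_pass)
```

To iterate over the squares a second time, the generator function has to be called again, as in the last assignment.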

As mentioned above, generators are memory efficient compared to regular functions that build a list. That is due to how generators work: a generator keeps only its current state and yields the next value when the caller requests it, whereas the regular function processes all the data and holds every result in memory before returning. The total running time is usually similar, but the caller can start using the first value right away. This may not make much difference with small data sets, but it helps when working with large ones.
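
A rough way to see the memory difference is to compare the size of the returned objects with sys.getsizeof. This is only a sketch: the exact numbers depend on the Python version and platform, and for the list the figure covers the list structure itself, not the integers it refers to. The list-building function is written here as a list comprehension for brevity.

```python
import sys


def square_function(number):
    # Builds and returns the whole list at once
    return [num ** 2 for num in range(1, number)]


def square_generator(number):
    # Yields one square at a time
    for num in range(1, number):
        yield num ** 2


list_size = sys.getsizeof(square_function(100_000))
gen_size = sys.getsizeof(square_generator(100_000))

print("list object size in bytes:", list_size)
print("generator object size in bytes:", gen_size)
```

The generator object stays small regardless of the argument, because it stores only its current state rather than the results.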

Return statement in generator

One other thing to mention here is the difference in the way the 'return' statement works in a generator.

def square_generator(number):

    # Yield square of each number
    for num in range(1, number):
        yield num ** 2
        return num


# A generator object is created
sq_gen = square_generator(10)
print(sq_gen)  # Generator object

# Iterate through square generator
for square in square_generator(10):
    print(square)



Result

<generator object square_generator at 0x7fe5b0195740>
1


In the above example,
  • Line - 6: We use a 'return' statement after the yield in the for loop.
    • The first time the yield statement runs (Line - 5), control is returned to the caller.
    • When the next value is requested, the 'return' statement runs and the generator terminates by raising 'StopIteration'. The returned value is not yielded; it is attached to the exception, which the for loop discards.
  • Line - 10: The result of calling the generator function is assigned to 'sq_gen'.
    • For a regular function, the variable receives the function's return value (None if the function returns nothing).
    • For a generator function, the variable receives a generator object, which can be iterated.
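
If the value passed to 'return' is needed, it can be read from the 'value' attribute of the StopIteration exception. The sketch below reuses the same generator and drives it by hand with 'next'; the names `first` and `returned` are illustrative.

```python
def square_generator(number):
    # Yield square of each number
    for num in range(1, number):
        yield num ** 2
        return num


gen = square_generator(10)
first = next(gen)  # the yielded square of 1

try:
    next(gen)  # resumes after the yield and reaches 'return num'
except StopIteration as stop:
    returned = stop.value  # the value given to 'return'

print(first, returned)
```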

Hope the above info was useful in understanding how generators work in Python.

