
Working with Sets in Python

Sets in Python

A set is a collection of data (possibly of different data types) and is written using curly brackets.

Set is one of the built-in collection data types in Python and has unique features compared to the others (list, tuple and dictionary). Unlike a list or a tuple, a set is not a sequence, because its data has no position.

Below are some of the features of sets. 
  • Sets can hold data of different data types.
  • Data can only be added to or removed from a set; an existing value cannot be amended in place.
  • Data in a set is unordered (i.e., the order in which the data is stored is arbitrary and can differ from one run of a program to the next) and cannot be accessed by index.
  • Duplicate data is not allowed in sets.
In this post, we will see how to access, add and remove the data in sets, and how to use the different set methods. 

Creating a set in Python

Before we go on to the different methods of a set, let us see how to create a set in Python.

A set is created when the data is assigned within curly brackets (unlike dictionaries, there are no key-value pairs in sets). 

Syntax:

new_set = {"Value1", "Value2",...}

Creating a set in Python
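
A minimal sketch of the syntax above. The values are made up for illustration and are not taken from the original post.

```python
# Illustrative values; not taken from the original post.
new_set = {"Value1", "Value2", "Value3", "Value1"}

# The duplicate "Value1" is stored only once, so the length is 3.
print(len(new_set))
print(new_set)

# Empty curly brackets create a dictionary, so an empty set needs set().
empty_set = set()
print(type(empty_set))
```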

As I mentioned in the features, data is stored in no specific order and the order might vary between runs. Running the above code twice can store and print the same data in a different order each time.

Creating a set by copying the data

Another way of creating a set is by copying the data from another set. This can be done by using the copy() method. This method doesn't accept any arguments.

Syntax:

new_set = old_set.copy()

Creating a set by copying data from the other set
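
A small sketch of copy(), assuming an existing set named old_set with made-up values. It also shows that the copy is a separate set, so adding to it leaves the original untouched.

```python
# Illustrative values; not taken from the original post.
old_set = {"A", "B", "C"}

# copy() takes no arguments and returns a new set with the same data.
new_set = old_set.copy()
print(new_set == old_set)  # True: same data
print(new_set is old_set)  # False: two separate sets

# Changing the copy does not change the original.
new_set.add("D")
print(len(old_set), len(new_set))
```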

In the above example, a set is created by copying the data from another set that has already been created. 


Accessing the data from a Set

Data in a set is neither indexed nor ordered. So, how do we access the data in a set? A specific value cannot be retrieved from a set, as there is no index or key. But the data in a set can be accessed by iterating over it in a loop.

Let's have a look at an example using a for loop to access the data from a set. 

Access the data from a set in Python
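
A minimal sketch of the loop, with made-up values. The order in which the values are printed is not guaranteed.

```python
# Illustrative values; not taken from the original post.
new_set = {"Apple", "Banana", "Cherry"}

# The loop body runs once for each value in the set.
for value in new_set:
    print(value)
```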

In the above example, the body of the for loop is executed once for each value present in the set. Again, the data in the set is stored in no specific order, so the sequence of the data returned could be different every time the program is run. 


Adding the data to a Set

Data present in a set cannot be amended. But new data can be added by using the add() method. This method accepts a single argument (i.e., the data to be added). 

Syntax:

new_set.add("Data to be added")

Let's have a look at the example. 
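
A minimal sketch, assuming a set that already contains "NEW" (the other values are made up). The two add() calls sit on lines 4 and 5 of the block.

```python
# Illustrative starting values; "NEW" is already in the set.
new_set = {"OLD", "NEW", "DATA"}

new_set.add("NEW")       # already present, so the set is unchanged
new_set.add("NEW DATA")  # not present yet, so it is added
print(new_set)
```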


In the above example, we are using add method twice.
  • Line - 4: We are adding "NEW" to the set. "NEW" is already present, so another entry won't be added in the set. Duplicate values aren't allowed. 
  • Line - 5: We are adding "NEW DATA" to the set. This would be added to the set in no specific order.
If we print the set after adding the new data, just like when creating a set, the data wouldn't be in a specific order. 


Updating a set with data from another set

The add() method is helpful if only one element needs to be added to a set. If we need to add a set of values from another set, the update() method is helpful. This method accepts an iterable as an argument and adds each element in the iterable to the set. Data in the set passed as an argument won't be updated. 

Syntax:

set_one.update(set_two)

This can be easily understood with the below example. 

Add data to set from the other set
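
A minimal sketch with made-up values, laid out so that the two update() calls sit on lines 7 and 12 of the block.

```python
# Illustrative values; not taken from the original post.
set_one = {"A", "B", "C"}
set_two = {"C", "D"}

# Passing a set adds each of its values to set_one;
# "C" is already present, so it is not added twice.
set_one.update(set_two)
print(set_one)

# Passing a string treats it as an iterable of characters,
# so each character is added as a separate value.
set_one.update("XY")
print(set_one)
```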

In the above example, 
  • Line - 7: Passing the set as an argument to update method would add the data to 'set_one'. Any duplicate values will not be added. 
  • Line - 12: We are passing a string as an argument. Unlike with the add() method, the string won't be added to the set as is. The update() method treats the string as an iterable and adds each character as a separate value in the set. 

Deleting the data from a Set

There are multiple methods to delete the data from a set. 
  • remove()
  • discard()
  • pop()
remove() method accepts the data that needs to be removed from a set. 

Syntax:

set_one.remove("Data to be removed")

Remove the data from a set in Python
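
A minimal sketch of remove() with made-up values. The second call passes a value that is not in the set, and the KeyError that Python raises is caught so the script keeps running.

```python
# Illustrative values; not taken from the original post.
set_one = {"A", "B", "C"}

# "B" is present, so it is removed.
set_one.remove("B")
print(set_one)

# "Z" is not present, so remove() raises KeyError.
try:
    set_one.remove("Z")
except KeyError as error:
    print("KeyError:", error)
```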

The value passed as the argument is removed from the set. If a value that is not present in the set is passed, an exception (KeyError) is thrown. 


Like the remove() method, the discard() method accepts an argument that needs to be removed from a set. 

Syntax:

set_one.discard("Data to be removed")

Remove the data from a set in Python
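
A minimal sketch of discard() with made-up values, including a call for a value that is not in the set.

```python
# Illustrative values; not taken from the original post.
set_one = {"A", "B", "C"}

# "B" is present, so it is removed.
set_one.discard("B")

# "Z" is not present; discard() does nothing and raises no exception.
set_one.discard("Z")
print(set_one)
```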


So, what's the difference between the remove() and discard() methods? The remove() method throws an exception when data that is not present in the set is passed. The discard() method throws no exception in that case and simply leaves the set unchanged.

The pop() method works differently from remove() and doesn't accept any argument. It removes an arbitrary value from the set and returns the value that was removed. Calling pop() on an empty set throws an exception (KeyError). 

Syntax:

set_one.pop()

Remove data from a set using pop method
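
A minimal sketch of pop() with made-up values. Which value gets removed is not predictable, so the example only prints it.

```python
# Illustrative values; not taken from the original post.
set_one = {"A", "B", "C"}

# pop() takes no arguments, removes one value and returns it.
removed = set_one.pop()
print("Removed:", removed)
print("Remaining:", set_one)
```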

Data deleted in each call could be different as the data is stored in no specific order. 


One other way to delete the data from a set is the clear() method. This clears all the data present in a set and leaves an empty set. 

Syntax:

set_one.clear()

Clear data from a set in Python
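
A minimal sketch of clear() with made-up values.

```python
# Illustrative values; not taken from the original post.
set_one = {"A", "B", "C"}

# clear() removes every value; the variable still refers to a set.
set_one.clear()
print(len(set_one))  # 0
print(set_one == set())  # True
```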

Union and Intersection of two sets

union() and intersection() are the two useful methods when working with two different sets. 

union() - accepts a set as an argument and returns a new set with the data from both sets (any duplicate data is present only once in the result).

Syntax:

new_set = set_one.union(set_two)

intersection() - accepts a set as an argument and returns a new set with only the data that is present in both sets.

Syntax:

new_set = set_one.intersection(set_two)

With the union() and intersection() methods above, the data in both sets isn't affected. A new set is returned with the corresponding result.

One other method for intersection is intersection_update(). It doesn't return a resulting set; instead, it updates the calling set with the result, i.e., only the data common to both sets is kept in the calling set and everything else is removed from it. 

Below example shows the use of all three methods. 

Union and Intersection of sets in Python
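
A minimal sketch of the three methods with made-up values, laid out so that the calls sit on lines 7, 11 and 16 of the block.

```python
# Illustrative values; not taken from the original post.
set_one = {1, 2, 3, 4}
set_two = {3, 4, 5, 6}

# union(): every value from both sets, with duplicates stored once.
# Neither set_one nor set_two is changed.
print(set_one.union(set_two))

# intersection(): only the values present in both sets.
# Neither set_one nor set_two is changed.
print(set_one.intersection(set_two))

# intersection_update(): set_one itself is reduced to the common
# values; the method returns None. set_two is not changed.
print(set_one)
set_one.intersection_update(set_two)
print(set_one)
```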

In the above example, 
  • Line - 7: union() method would return the data from both the sets and would be printed by print function. Both sets 'set_one' and 'set_two' won't be updated with this operation.
  • Line - 11: intersection() method would return the data that is common to both the sets and printed by print function. Both sets 'set_one' and 'set_two' won't be updated with this operation.
  • Line - 16: intersection_update() method would update the 'set_one' with the data that is present in both the sets 'set_one' and 'set_two'.

Relationship between two sets

While working with data, it becomes essential to check the relationship between different sets, such as: 
  • If a set is a subset of another set
  • If a set is a superset of another set
  • If two sets are disjoint
The below methods help to achieve this. 

issubset()

This method checks if a set is a subset of the set passed in the argument and returns True if it is a subset and False if not a subset.

Syntax:

set_one.issubset(set_two)

issuperset()

This method checks if a set is a superset of the set passed in the argument and returns True if it is a superset and False if not a superset. 

Syntax:

set_one.issuperset(set_two)

isdisjoint()

This method checks if a set is disjoint from the set passed in the argument (i.e., no elements are common between both the sets) and returns True if they are disjoint and False if not.

Syntax:

set_one.isdisjoint(set_two)

Below is a simple example using these three methods. 

Subset, superset and disjoint in Python
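
A minimal sketch of the three checks. The set names and values are made up for illustration.

```python
# Illustrative values; not taken from the original post.
set_one = {1, 2}
set_two = {1, 2, 3, 4}
set_three = {7, 8}

# Every value of set_one is also in set_two.
print(set_one.issubset(set_two))      # True

# set_two contains every value of set_one.
print(set_two.issuperset(set_one))    # True

# set_one and set_three share no values.
print(set_one.isdisjoint(set_three))  # True

# set_one and set_two share 1 and 2, so they are not disjoint.
print(set_one.isdisjoint(set_two))    # False
```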

Difference between two sets

Identifying the difference between two sets becomes essential when working with data. Below methods are helpful to retrieve the difference between the sets. 

difference()

Returns the data that is present in a set and not present in the set passed as an argument. The difference is returned as a new set. Neither of the two sets is updated by this method. 

Syntax:

new_set = set_one.difference(set_two)

difference_update()

Updates a set so that it keeps only the data that is not present in the set passed as an argument. In other words, elements present in both the sets are removed from the initial set. 

This method doesn't return any value.

Syntax:

set_one.difference_update(set_two)

Below is a simple example by using these two methods. 

Difference between two sets in Python
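
A minimal sketch of both methods with made-up values.

```python
# Illustrative values; not taken from the original post.
set_one = {1, 2, 3, 4}
set_two = {3, 4, 5, 6}

# difference(): values in set_one that are not in set_two,
# returned as a new set. set_one and set_two are unchanged.
new_set = set_one.difference(set_two)
print(new_set)

# difference_update(): removes from set_one every value that is
# also in set_two. It changes set_one in place and returns None.
result = set_one.difference_update(set_two)
print(result)
print(set_one)
```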

symmetric_difference()

The difference() method only returns the data that is present in the set the method is called on and not in the set passed as the argument. Data that is present only in the argument set is ignored. 

The symmetric_difference() method considers both sets and returns the data from both after removing the common elements. In simple words, this is like the opposite of intersection. 

Syntax:

new_set = set_one.symmetric_difference(set_two)

symmetric_difference_update()

This method doesn't return a set. Instead, it updates the associated set with the data from both sets after removing the common elements. 

Syntax:

set_one.symmetric_difference_update(set_two)

Below is a simple example by using these two methods. 

Symmetric difference between sets in Python
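
A minimal sketch of both methods with made-up values.

```python
# Illustrative values; not taken from the original post.
set_one = {1, 2, 3, 4}
set_two = {3, 4, 5, 6}

# symmetric_difference(): values that are in exactly one of the
# two sets, returned as a new set. Both sets are unchanged.
new_set = set_one.symmetric_difference(set_two)
print(new_set)

# symmetric_difference_update(): stores the same result in
# set_one itself and returns None. set_two is unchanged.
set_one.symmetric_difference_update(set_two)
print(set_one)
```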


Hope the above details were a bit of help to you in understanding more about Sets in Python.


