Kodeclik Blog
How to Sort Rows in Python
Sorting rows in Python is useful when you need to organize tabular data, arrange records by a key value, or prepare data for further processing. This operation appears frequently in data science, analytics, and even simple Python scripts handling structured lists.
How are rows stored in Python?
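In Python, rows of data can be stored in different formats depending on the context: as a plain list of lists, as a Numpy array, or as a Pandas Dataframe. The sections below show how to sort rows in each of these formats.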
Method 1: Sorting rows in a list of lists
When rows are in a basic list of lists, you can sort them using the built-in sorted() function with a lambda that specifies the sorting key.
Suppose you have a list of students stored as a list of lists, where each row contains [id, name, address, gpa].
# Example student data as list of lists
students = [
    [101, "Alice", "New York", 3.7],
    [105, "Bob", "Chicago", 3.2],
    [103, "Charlie", "Boston", 3.9],
    [102, "Diana", "Atlanta", 3.5]
]
# Sort by GPA (last column)
by_gpa = sorted(students, key=lambda row: row[3])
print("Sorted by GPA:", by_gpa)
# Sort by Name (second column)
by_name = sorted(students, key=lambda row: row[1])
print("Sorted by Name:", by_name)
# Sort by ID (first column)
by_id = sorted(students, key=lambda row: row[0])
print("Sorted by ID:", by_id)
Note that column indices start at zero, so row[0] is the ID, row[1] is the name, and row[3] is the GPA. Also note that sorted() returns a new list and leaves students unchanged. The output will look like:
Sorted by GPA: [[105, 'Bob', 'Chicago', 3.2], [102, 'Diana', 'Atlanta', 3.5], [101, 'Alice', 'New York', 3.7], [103, 'Charlie', 'Boston', 3.9]]
Sorted by Name: [[101, 'Alice', 'New York', 3.7], [105, 'Bob', 'Chicago', 3.2], [103, 'Charlie', 'Boston', 3.9], [102, 'Diana', 'Atlanta', 3.5]]
Sorted by ID: [[101, 'Alice', 'New York', 3.7], [102, 'Diana', 'Atlanta', 3.5], [103, 'Charlie', 'Boston', 3.9], [105, 'Bob', 'Chicago', 3.2]]
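Two common variations are sorting in descending order and sorting in place. The sketch below reuses the same student rows; operator.itemgetter is a standard-library alternative to the lambda, and the names by_gpa_desc and roster are illustrative only.

```python
from operator import itemgetter

students = [
    [101, "Alice", "New York", 3.7],
    [105, "Bob", "Chicago", 3.2],
    [103, "Charlie", "Boston", 3.9],
    [102, "Diana", "Atlanta", 3.5],
]

# itemgetter(3) builds the same key function as lambda row: row[3]
by_gpa_desc = sorted(students, key=itemgetter(3), reverse=True)
print([row[1] for row in by_gpa_desc])  # ['Charlie', 'Alice', 'Diana', 'Bob']

# list.sort() reorders the list in place and returns None
roster = [row[:] for row in students]  # copy so the original order is kept
roster.sort(key=itemgetter(0))
print([row[0] for row in roster])  # [101, 102, 103, 105]
```

Use sorted() when you want to keep the original list, and list.sort() when you are happy to reorder it in place.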
This approach is flexible and works well for small data tables, but for larger datasets you may want to use Numpy or Pandas for efficiency.
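Part of that flexibility is sorting by more than one column. As a minimal sketch, the rows below are hypothetical and deliberately contain tied GPAs so that the second key matters; a tuple key is one way to do it, and two stable sorting passes are another.

```python
# Hypothetical rows with tied GPAs: [id, name, address, gpa]
students = [
    [101, "Alice", "New York", 3.7],
    [105, "Bob", "Chicago", 3.2],
    [103, "Charlie", "Boston", 3.7],
    [102, "Diana", "Atlanta", 3.2],
]

# Tuple key: GPA descending (negated), then name ascending to break ties
ranked = sorted(students, key=lambda row: (-row[3], row[1]))
print([row[1] for row in ranked])  # ['Alice', 'Charlie', 'Bob', 'Diana']

# Same result in two passes: Python's sort is stable, so sort by the
# secondary key first and by the primary key second
two_pass = sorted(students, key=lambda row: row[1])
two_pass = sorted(two_pass, key=lambda row: row[3], reverse=True)
print(two_pass == ranked)  # True
```

The negation trick only works for numeric columns; the two-pass version also works when the descending column holds strings.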
Method 2: Sorting rows in a Numpy array
Numpy provides efficient array operations, and you can sort rows by a column with argsort: it returns the indices that would put that column in order, and indexing the array with those indices rearranges the whole rows.
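When you use Numpy, the data can be stored as a structured array or as a plain two-dimensional array. Here's an example with a plain array; because each row mixes numbers (IDs and GPAs) with strings, the array is created with dtype=object so that every value keeps its original Python type.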
import numpy as np
# Student data: id, name, address, gpa
students = np.array([
    [101, "Alice", "New York", 3.7],
    [105, "Bob", "Chicago", 3.2],
    [103, "Charlie", "Boston", 3.9],
    [102, "Diana", "Atlanta", 3.5]
], dtype=object)
# Sort by GPA (last column)
by_gpa = students[students[:, 3].argsort()]
print("Sorted by GPA:\n", by_gpa, "\n")
# Sort by Name (second column)
by_name = students[students[:, 1].argsort()]
print("Sorted by Name:\n", by_name, "\n")
# Sort by ID (first column, numeric)
by_id = students[students[:, 0].astype(int).argsort()]
print("Sorted by ID:\n", by_id)
The output will be:
Sorted by GPA:
[[105 'Bob' 'Chicago' 3.2]
[102 'Diana' 'Atlanta' 3.5]
[101 'Alice' 'New York' 3.7]
[103 'Charlie' 'Boston' 3.9]]
Sorted by Name:
[[101 'Alice' 'New York' 3.7]
[105 'Bob' 'Chicago' 3.2]
[103 'Charlie' 'Boston' 3.9]
[102 'Diana' 'Atlanta' 3.5]]
Sorted by ID:
[[101 'Alice' 'New York' 3.7]
[102 'Diana' 'Atlanta' 3.5]
[103 'Charlie' 'Boston' 3.9]
[105 'Bob' 'Chicago' 3.2]]
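For large arrays with a numeric dtype, this method is typically much faster than list-based sorting and is well suited to scientific computing or numerical data workflows. With dtype=object, as in this example, Numpy still compares Python objects one at a time, so most of that speed advantage is lost.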
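The structured-array option mentioned earlier lets you sort by field name instead of column position. The following is only a sketch: the field names and the string widths in student_dtype are assumptions made for this example, not something the data requires.

```python
import numpy as np

# Hypothetical field names and widths chosen for this sketch
student_dtype = [("id", "i4"), ("name", "U10"), ("address", "U10"), ("gpa", "f8")]

students = np.array([
    (101, "Alice", "New York", 3.7),
    (105, "Bob", "Chicago", 3.2),
    (103, "Charlie", "Boston", 3.9),
    (102, "Diana", "Atlanta", 3.5),
], dtype=student_dtype)

# np.sort with order= sorts the records by the named field
by_gpa = np.sort(students, order="gpa")
print(by_gpa["name"])  # ['Bob' 'Diana' 'Alice' 'Charlie']

# Reversing the sorted array gives descending order
by_gpa_desc = by_gpa[::-1]
print(by_gpa_desc["id"])  # [103 101 102 105]
```

Each field keeps its own type here, so no astype conversion is needed before sorting by ID.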
Method 3: Sorting rows in a Pandas Dataframe
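Pandas provides a high-level and readable way to sort datasets by one or multiple columns, making it especially useful for real-world datasets with labels. The primary method is sort_values(). Its by parameter takes a single column name or a list of column names, and its ascending parameter takes a single flag or a list of flags, one per column, so you can sort by GPA and then by Name in one call.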
import pandas as pd
# Example student data
data = {
    "id": [101, 105, 103, 102],
    "name": ["Alice", "Bob", "Charlie", "Diana"],
    "address": ["New York", "Chicago", "Boston", "Atlanta"],
    "gpa": [3.7, 3.2, 3.9, 3.5]
}
df = pd.DataFrame(data)
# Sort by GPA
by_gpa = df.sort_values(by="gpa")
print("Sorted by GPA:\n", by_gpa, "\n")
# Sort by Name
by_name = df.sort_values(by="name")
print("Sorted by Name:\n", by_name, "\n")
# Sort by GPA descending, then Name ascending
by_gpa_name = df.sort_values(by=["gpa", "name"], ascending=[False, True])
print("Sorted by GPA (desc) then Name:\n", by_gpa_name)
The output will be:
Sorted by GPA:
    id     name   address  gpa
1  105      Bob   Chicago  3.2
3  102    Diana   Atlanta  3.5
0  101    Alice  New York  3.7
2  103  Charlie    Boston  3.9
Sorted by Name:
    id     name   address  gpa
0  101    Alice  New York  3.7
1  105      Bob   Chicago  3.2
2  103  Charlie    Boston  3.9
3  102    Diana   Atlanta  3.5
Sorted by GPA (desc) then Name:
    id     name   address  gpa
2  103  Charlie    Boston  3.9
0  101    Alice  New York  3.7
3  102    Diana   Atlanta  3.5
1  105      Bob   Chicago  3.2
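Sorting with Pandas is concise: a single sort_values() call handles several sorting criteria, and columns are referred to by name rather than by position. Each row keeps its original index label after sorting, which is why the leftmost column in the output above appears out of order.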
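If you would rather have the rows renumbered after sorting, one option is to reset the index. The sketch below uses a shortened version of the student table (the address column is dropped for brevity), and the names top_first and renumbered are illustrative.

```python
import pandas as pd

df = pd.DataFrame({
    "id": [101, 105, 103, 102],
    "name": ["Alice", "Bob", "Charlie", "Diana"],
    "gpa": [3.7, 3.2, 3.9, 3.5],
})

# sort_values returns a new DataFrame; df itself is unchanged
top_first = df.sort_values(by="gpa", ascending=False)
print(top_first.index.tolist())  # [2, 0, 3, 1]

# reset_index(drop=True) renumbers the rows 0..n-1 and discards the old labels
renumbered = top_first.reset_index(drop=True)
print(renumbered["name"].tolist())  # ['Charlie', 'Alice', 'Diana', 'Bob']
```

Keeping the original labels is useful when you need to trace rows back to the unsorted table; resetting them is tidier for display or export.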
Summary
Sorting rows in Python can be done using plain lists, Numpy arrays, or Pandas Dataframes depending on your needs. Lists with lambda functions provide flexibility for small datasets, Numpy offers speed for numerical data, and Pandas gives a clean interface for labeled, real-world datasets. Choosing the right method depends on your data size and the complexity of operations.