How to Analyze Company Productivity in Python Using NumPy, OOP, and File Handling

A full walkthrough of a simple but powerful data-processing script

If you’re learning Python and you want a practical project that brings together NumPy, file handling, object-oriented programming, and data visualization, this little productivity-analysis script is a perfect example.

It reads data from a text file, processes employee productivity numbers across companies, finds the best and worst performers, computes salaries, and even plots employee productivity using Matplotlib.

Let’s break this entire program down step by step and understand what every part is doing.

The Data: What Are We Working With?

The file company.txt contains rows of numbers.
Each row represents one company.
Each number represents how many products a single employee produced.

Example rows:

52,43,56,47,58,41,50,44,59,45
43,50,38,54,46,36,52,42,51,37,49,40
23,27,21,29,26,24,28,25,22,20,30,23,26,21,29
42,56,71,35,63,77
43,58,52,46,55,41,50,59,47,54

Different companies have different numbers of employees.

Reading and Cleaning the File

Let’s start with file_handling():

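import numpy as np
import matplotlib.pyplot as plt  # used later by IntArray.show_data()


def file_handling():
    lines = []

    with open("company.txt", "r") as file:
        for line in file:
            line = line.strip()
            if not line:
                continue  # skip blank lines, which would make int() fail
            values = line.split(",")
            int_values = [int(val) for val in values]
            lines.append(int_values)

    # Rows have different lengths, so each one becomes its own integer
    # array inside an outer array of dtype "object".
    data_frame = np.array([np.array(row) for row in lines], dtype="object")
    for row in data_frame:
        for i in row:
            print(type(i))  # debugging aid: shows the NumPy integer type
    return data_frame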

What’s happening here?

The file is opened.
Each line is split by commas.
Every value is converted into an integer.
All rows are stored inside a NumPy object array (since the rows have different lengths).
The function prints the type of each element, just for debugging.
Finally, it returns a NumPy array where each element is one company’s data.

In simple terms:
It turns raw text into clean numerical data.
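The parsing steps above can be tried without touching company.txt. What follows is a minimal sketch, not the script's own code: parse_rows is a hypothetical helper that accepts any iterable of text lines, and io.StringIO stands in for the open file. It fills the outer object array element by element, so each entry is one company's integer array.

```python
import io

import numpy as np


def parse_rows(lines):
    """Hypothetical helper: turn comma-separated text lines into integer arrays."""
    rows = []
    for line in lines:
        line = line.strip()
        if not line:
            continue  # ignore blank lines
        rows.append(np.array([int(val) for val in line.split(",")]))
    # Fill the outer object array one element at a time so that every
    # element is one company's integer array, whatever the row lengths are.
    data = np.empty(len(rows), dtype=object)
    for idx, row in enumerate(rows):
        data[idx] = row
    return data


# In-memory stand-in for company.txt, using two rows from the example data.
sample = io.StringIO("52,43,56,47,58,41,50,44,59,45\n42,56,71,35,63,77\n")
companies = parse_rows(sample)

print(len(companies))      # number of companies read
print(companies[1])        # second company's productivity numbers
print(companies[1].dtype)  # an integer dtype; the exact width depends on the platform
```

Taking an iterable of lines instead of a fixed file name is a design choice of this sketch. It makes the parsing step easy to test with in-memory data.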

Calculating Productivity

def productivity_of_company(order, data_frame):
    return np.sum(data_frame[order])

This function takes a zero-based company index (order) and returns the total number of products that company's employees made.
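To see what the sum looks like on real numbers, here is a small sketch that applies np.sum to three of the example rows. The rows are typed in by hand here rather than read from the file, and the names rows and totals exist only in this sketch.

```python
import numpy as np

# Three rows from the example data, kept as separate arrays because
# their lengths differ.
rows = [
    np.array([52, 43, 56, 47, 58, 41, 50, 44, 59, 45]),
    np.array([43, 50, 38, 54, 46, 36, 52, 42, 51, 37, 49, 40]),
    np.array([42, 56, 71, 35, 63, 77]),
]

# np.sum adds up every employee's output within one company.
totals = [int(np.sum(row)) for row in rows]
print(totals)  # [495, 538, 344]
```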

Finding the Most Productive Company

def max_productivity(data_frame):
    best_company = 1  # companies are reported starting at 1
    num_products = 0

    for i in range(len(data_frame)):
        result = productivity_of_company(i, data_frame)
        if result > num_products:
            num_products = result
            best_company = i + 1
    print(f"The best company is company {best_company} with {num_products} products")

This loops through all companies, sums their output, and keeps track of the highest one.

Important details:

Companies are printed starting at 1, not 0.
num_products tracks the maximum productivity found so far.
If a company produces more, it replaces the current leader.

Finding the Worst Company

def min_productivity(data_frame):
    worst_company = 1  # start by treating the first company as the worst
    num_of_products = productivity_of_company(0, data_frame)

    for i in range(1, len(data_frame)):
        result = productivity_of_company(i, data_frame)
        if result < num_of_products:  # strict comparison: ties keep the earlier company
            num_of_products = result
            worst_company = i + 1
    print(f"The worst company is company {worst_company} with {num_of_products} products")

Very similar logic, but this time the code tracks the smallest productivity value instead of the largest.
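Once the totals are available as an array, both searches can also be written with NumPy's np.argmax and np.argmin. This is an alternative sketch, not the script's approach. The totals below are the sums of the five example rows, worked out by hand for this illustration.

```python
import numpy as np

# Totals for the five example rows, computed beforehand with np.sum.
totals = np.array([495, 538, 374, 344, 505])

best_index = int(np.argmax(totals))   # position of the largest total
worst_index = int(np.argmin(totals))  # position of the smallest total

# Add 1 so that companies are numbered from 1, as in the printed messages.
print(f"Best: company {best_index + 1} with {totals[best_index]} products")
print(f"Worst: company {worst_index + 1} with {totals[worst_index]} products")
```

With the example data this reports company 2 as the best (538 products) and company 4 as the worst (344 products). When several companies tie, both functions return the first matching position.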

Calculating the Mean Productivity of Each Company

def mean_products(data_frame):
    for i in range(len(data_frame)):
        average = np.mean(data_frame[i])
        # i + 1 keeps the numbering consistent with the other reports (starting at 1)
        print(f"On average, an employee from company {i + 1} produced {average:.2f} products")

This prints the average monthly output per employee for each company.
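The mean can tell a different story from the total, because companies differ in size. The sketch below compares two of the example rows. The variable names are made up for this illustration.

```python
import numpy as np

rows = [
    np.array([52, 43, 56, 47, 58, 41, 50, 44, 59, 45]),  # first example row
    np.array([42, 56, 71, 35, 63, 77]),                  # fourth example row
]

averages = []
for row in rows:
    total = int(np.sum(row))
    average = float(np.mean(row))
    averages.append(average)
    print(f"total={total}, employees={len(row)}, average={average:.2f}")
```

The second row has the lower total (344 against 495) but the higher average (57.33 against 49.50), since it has only six employees.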

The IntArray Class: A Small OOP Wrapper

This part of the code introduces classes, giving structure to how arrays are handled.

class IntArray:

    def __init__(self, int_array):
        # np.issubdtype accepts any integer width (int32, int64, ...)
        if not isinstance(int_array, np.ndarray) or not np.issubdtype(int_array.dtype, np.integer):
            raise ValueError("Input must be a numpy array of integers")
        self.int_array = int_array

The constructor ensures that the data passed to the class is:

a NumPy array
containing only integers

This protects the class from invalid input.
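A quick way to see the check at work is to feed the constructor a few different inputs. The sketch below uses a hypothetical stand-in class, IntArraySketch, that contains only the constructor, plus a made-up helper named accepts. It tests the dtype with np.issubdtype so that any integer width passes, which is an assumption of this sketch.

```python
import numpy as np


class IntArraySketch:
    """Hypothetical stand-in for IntArray, showing only the constructor check."""

    def __init__(self, int_array):
        is_array = isinstance(int_array, np.ndarray)
        if not is_array or not np.issubdtype(int_array.dtype, np.integer):
            raise ValueError("Input must be a numpy array of integers")
        self.int_array = int_array


def accepts(value):
    """Return True if the constructor accepts the value, False if it raises."""
    try:
        IntArraySketch(value)
        return True
    except ValueError:
        return False


print(accepts(np.array([52, 43, 56])))        # True: integer array
print(accepts(np.array([52.0, 43.5, 56.0])))  # False: float array
print(accepts([52, 43, 56]))                  # False: plain Python list
```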

Displaying the Array

def display(self):
    print(self.int_array)

Simply prints the array of productivity numbers.

Salary Calculation

def salary(self):
    money_per_product = 7  # each product earns the employee 7 units of money
    # Multiplying by a scalar is applied element-wise to the whole array.
    salaries = self.int_array * money_per_product
    print(f"Employees made {self.int_array} products and these are their salaries: {salaries}")

The code assumes that every employee earns 7 units of money per product. It multiplies each employee’s output by 7 and prints the resulting salary list.
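The multiplication is element-wise, so it can be written either with a scalar or with an array filled with the rate. The sketch below shows that both forms give the same salaries for one of the example rows. The constant name PAY_PER_PRODUCT is made up for this illustration.

```python
import numpy as np

PAY_PER_PRODUCT = 7  # rate from the article; the constant name is an assumption

products = np.array([42, 56, 71, 35, 63, 77])

# Multiplying by a scalar broadcasts it across every element.
salaries = products * PAY_PER_PRODUCT

# The same result via an explicit array filled with the rate.
rate_array = np.full(products.shape, PAY_PER_PRODUCT)
assert np.array_equal(salaries, products * rate_array)

print(salaries)  # [294 392 497 245 441 539]
```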

Plotting Employee Productivity

def show_data(self):
    x = np.arange(len(self.int_array))
    plt.plot(x, self.int_array, marker='o')
    plt.title("Productivity of employees")
    plt.xlabel("rank of employees")
    plt.ylabel("products/month")
    plt.xticks(x)
    plt.grid()
    plt.show()

This draws a line plot showing each employee’s productivity visually.

Breakdown:

x-axis: each employee’s position in the row (0, 1, 2…), which the plot labels as rank
y-axis: number of products produced
grid and markers help readability

This turns raw numbers into something you can visually interpret.
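On a machine without a display, plt.show() has no window to open. A possible variation, sketched here under that assumption, selects Matplotlib's non-interactive Agg backend and saves the figure to a file. The output path and the axis label "employee index" are choices made only for this sketch.

```python
import os
import tempfile

import matplotlib

matplotlib.use("Agg")  # non-interactive backend: render to a file, open no window
import matplotlib.pyplot as plt
import numpy as np

products = np.array([42, 56, 71, 35, 63, 77])
x = np.arange(len(products))  # 0, 1, 2, ... one position per employee

fig, ax = plt.subplots()
ax.plot(x, products, marker="o")
ax.set_title("Productivity of employees")
ax.set_xlabel("employee index")
ax.set_ylabel("products/month")
ax.set_xticks(x)
ax.grid()

# Hypothetical output location, chosen only for this sketch.
out_path = os.path.join(tempfile.gettempdir(), "productivity_sketch.png")
fig.savefig(out_path)
plt.close(fig)
print(out_path)
```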

main()

def main():
    data_frame = file_handling()
    print(data_frame)
    first_branch = IntArray(data_frame[0])
    first_branch.display()
    first_branch.salary()
    first_branch.show_data()
    max_productivity(data_frame)
    min_productivity(data_frame)


# Run main() only when the file is executed as a script.
if __name__ == "__main__":
    main()

Here’s what happens:

Data is loaded from the file.
The first company’s data is wrapped inside an IntArray object.

The program:

prints the productivity numbers
calculates salaries
visualizes employee productivity

Finally, the program identifies:

the company with the highest total production
the company with the lowest total production

This is a full data analysis pipeline wrapped in a single script.

Conclusion

This project pulls together several core Python skills:

file reading and parsing
using NumPy arrays for efficient numerical operations
implementing object-oriented design
performing analysis with aggregation functions
creating data visualizations
working with real-world style datasets

It’s a great example of how separate Python concepts come together in a practical, understandable way. If you can follow and rebuild this script, you’re well on your way to handling more complex data-driven Python applications.

To check the implementation, clone this repo.

How to Analyze Company Productivity in Python Using NumPy, OOP, and File Handling was originally published in Javarevisited on Medium.
