How to Analyze Company Productivity in Python Using NumPy, OOP, and File Handling

A full walkthrough of a simple but powerful data-processing script
If you’re learning Python and you want a practical project that brings together NumPy, file handling, object-oriented programming, and data visualization, this little productivity-analysis script is a perfect example.
It reads data from a text file, processes employee productivity numbers across companies, finds the best and worst performers, computes salaries, and even plots employee productivity using Matplotlib.
Let’s break this entire program down step by step and understand what every part is doing.
The Data: What Are We Working With?
The file company.txt contains rows of numbers.
Each row represents one company.
Each number represents how many products a single employee produced.
Example rows:
52,43,56,47,58,41,50,44,59,45
43,50,38,54,46,36,52,42,51,37,49,40
23,27,21,29,26,24,28,25,22,20,30,23,26,21,29
42,56,71,35,63,77
43,58,52,46,55,41,50,59,47,54
Different companies have different numbers of employees.
Reading and Cleaning the File
Let’s start with file_handling():
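import numpy as np
import matplotlib.pyplot as plt  # used later by show_data()

def file_handling():
    lines = []
    with open("company.txt", "r") as file:
        for line in file:
            # Split each comma-separated row and convert the values to integers
            values = line.strip().split(",")
            int_values = [int(val) for val in values]
            lines.append(int_values)
    # Rows have different lengths, so they are stored in an object array
    data_frame = np.array([np.array(row) for row in lines], dtype="object")
    # Debug output: print the type of every value
    for row in data_frame:
        for i in row:
            print(type(i))
    return data_frame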
What’s happening here?
- The file is opened.
- Each line is split by commas.
- Every value is converted into an integer.
- All rows are stored inside a NumPy object array (since the rows have different lengths).
- The function prints the type of each element, just for debugging.
- Finally, it returns a NumPy array where each element is one company’s data.
In simple terms:
It turns raw text into clean numerical data.
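A blank line in the file (for example a trailing empty line) would make int("") raise a ValueError in the function above. Here is a minimal sketch of the same parsing idea that skips blank lines. The helper name parse_rows and the in-memory sample text are assumptions made for illustration, so the snippet runs without company.txt.

```python
import io
import numpy as np

def parse_rows(text_stream):
    """Parse comma-separated integer rows, skipping blank lines (hypothetical helper)."""
    rows = []
    for line in text_stream:
        line = line.strip()
        if not line:  # ignore empty lines such as a trailing newline
            continue
        rows.append(np.array([int(val) for val in line.split(",")]))
    # Fill an object array element by element so each entry stays a 1-D row,
    # even if every row happens to have the same length.
    result = np.empty(len(rows), dtype=object)
    for i, row in enumerate(rows):
        result[i] = row
    return result

# In-memory stand-in for company.txt, with a blank line in the middle
sample = io.StringIO("52,43,56\n\n42,56,71,35\n")
companies = parse_rows(sample)
print(len(companies))
print(companies[1].tolist())
```

Filling the object array one element at a time is a design choice: it keeps the "one array per company" layout regardless of row lengths.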
Calculating Productivity
def productivity_of_company(order, data_frame):
    return np.sum(data_frame[order])
This function takes a company index and returns the total number of products their employees made.
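As a quick worked example, the sketch below applies the function to two of the example rows shown earlier (the first and the fourth). The small hand-built array is only a stand-in for the data loaded from the file.

```python
import numpy as np

def productivity_of_company(order, data_frame):
    return np.sum(data_frame[order])

# Two example rows from the article, stored as a ragged object array
data_frame = np.empty(2, dtype=object)
data_frame[0] = np.array([52, 43, 56, 47, 58, 41, 50, 44, 59, 45])
data_frame[1] = np.array([42, 56, 71, 35, 63, 77])

total_first = productivity_of_company(0, data_frame)
total_second = productivity_of_company(1, data_frame)
print(total_first, total_second)  # 495 344
```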
Finding the Most Productive Company
def max_productivity(data_frame):
    i = 0
    best_company = i + 1
    num_products = 0
    for i in range(len(data_frame)):
        result = productivity_of_company(i, data_frame)
        if result > num_products:
            num_products = result
            best_company = i + 1
    print(f"The best company is the {best_company}. company with {num_products}")
This loops through all companies, sums their output, and keeps track of the highest one.
Important details:
- Companies are printed starting at 1, not 0.
- num_products tracks the maximum productivity found so far.
- If a company produces more, it replaces the current leader.
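The same search can also be written without an explicit loop. The following is a sketch of a vectorized alternative using np.argmax, not the script's own method. It uses the five example rows from the data section, and the variable names are chosen for illustration.

```python
import numpy as np

# The five example rows from the article
rows = [
    [52, 43, 56, 47, 58, 41, 50, 44, 59, 45],
    [43, 50, 38, 54, 46, 36, 52, 42, 51, 37, 49, 40],
    [23, 27, 21, 29, 26, 24, 28, 25, 22, 20, 30, 23, 26, 21, 29],
    [42, 56, 71, 35, 63, 77],
    [43, 58, 52, 46, 55, 41, 50, 59, 47, 54],
]

# One total per company, then the index of the largest total
totals = np.array([sum(row) for row in rows])
best_index = int(np.argmax(totals))

print(totals.tolist())  # [495, 538, 374, 344, 505]
print(f"Company {best_index + 1} leads with {totals[best_index]} products")
```

With these rows, the second company has the highest total (538).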
Finding the Worst Company
def min_productivity(data_frame):
    i = 0
    worst_company = i + 1
    num_of_products = productivity_of_company(0, data_frame)
    for i in range(len(data_frame)):
        result = productivity_of_company(i, data_frame)
        if result <= num_of_products:
            num_of_products = result
            worst_company = i + 1
    print(f"The worst company is the {worst_company}. company with {num_of_products}")
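Very similar logic, but this time the code tracks the smallest productivity value instead of the largest. One difference is worth noting: the comparison here is <=, so if two companies tie for the lowest total, the later one is reported, whereas max_productivity uses > and reports the earlier one.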
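How ties are handled depends on the comparison operator. The sketch below uses made-up totals in which two companies tie for last place. It compares a loop using <= (as in min_productivity) with np.argmin, which returns the first occurrence of the minimum.

```python
import numpy as np

totals = np.array([10, 5, 5])  # hypothetical totals; companies 2 and 3 tie

# Same comparison as min_productivity: "<=" lets a later tie overwrite an earlier one
worst_loop = 0
for i in range(len(totals)):
    if totals[i] <= totals[worst_loop]:
        worst_loop = i

# np.argmin returns the index of the first minimum it finds
worst_argmin = int(np.argmin(totals))

print(worst_loop + 1, worst_argmin + 1)  # 3 2
```

Neither choice is wrong, but it helps to pick one tie rule and use it in both the max and min functions.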
Calculating the Mean Productivity of Each Company
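def mean_products(data_frame):
    for i in range(len(data_frame)):
        average = np.mean(data_frame[i])
        # Number companies from 1, consistent with max_productivity and min_productivity
        print(f"On average, an employee from {i + 1}. company produced {average} products")
This prints the average output per employee for each company, with companies numbered from 1 as in the other functions.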
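As a small sketch of the same calculation, the snippet below computes the averages for two of the example rows and rounds the printed value to two decimals. The two-row list and the formatting choice are assumptions for illustration.

```python
import numpy as np

# First and fourth example rows from the article
rows = [
    np.array([52, 43, 56, 47, 58, 41, 50, 44, 59, 45]),
    np.array([42, 56, 71, 35, 63, 77]),
]

averages = [float(np.mean(row)) for row in rows]

# enumerate(..., start=1) gives 1-based numbering within this small list
for number, average in enumerate(averages, start=1):
    print(f"On average, an employee from company {number} produced {average:.2f} products")
```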
The IntArray Class: A Small OOP Wrapper
This part of the code introduces classes, giving structure to how arrays are handled.
class IntArray:
    def __init__(self, int_array):
        if not isinstance(int_array, np.ndarray) or int_array.dtype != int:
            raise ValueError("Input must be a numpy array of integers")
        self.int_array = int_array
The constructor ensures that the data passed to the class is:
- a NumPy array
- containing only integers
This protects the class from invalid input.
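One caveat: comparing dtype against int checks for the platform's default integer type, so an array with a different integer width (for example int32 where the default is int64) would be rejected. A possible, more permissive variant is sketched below using np.issubdtype. The helper name is_integer_array is hypothetical and not part of the original script.

```python
import numpy as np

def is_integer_array(candidate):
    """Return True for NumPy arrays of any integer dtype (hypothetical helper)."""
    return isinstance(candidate, np.ndarray) and bool(
        np.issubdtype(candidate.dtype, np.integer)
    )

print(is_integer_array(np.array([1, 2, 3])))                  # True
print(is_integer_array(np.array([1, 2, 3], dtype=np.int32)))  # True
print(is_integer_array(np.array([1.0, 2.0])))                 # False
print(is_integer_array([1, 2, 3]))                            # False: a list, not an ndarray
```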
Displaying the Array
    def display(self):
        print(self.int_array)
Simply prints the array of productivity numbers.
Salary Calculation
    def salary(self):
        array_shape = self.int_array.shape
        # Array of the same shape, filled with the pay rate per product
        money_per_product = np.full(array_shape, 7)
        salaries = self.int_array * money_per_product
        print(f"People made {self.int_array} products and these are their salaries {salaries}")
Assuming:
- every employee earns 7 units of money per product
The code multiplies each employee’s output by 7 and prints the resulting salary list.
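Because NumPy broadcasts scalars, the helper array built with np.full is optional. The sketch below shows that both forms give the same salaries for one example row. The constant name RATE_PER_PRODUCT is chosen here for illustration.

```python
import numpy as np

RATE_PER_PRODUCT = 7  # pay per product, as assumed in the article

products = np.array([42, 56, 71, 35, 63, 77])

# Explicit array of rates, as in the salary() method
salaries_full = products * np.full(products.shape, RATE_PER_PRODUCT)
# Scalar broadcasting gives the same result with less code
salaries_scalar = products * RATE_PER_PRODUCT

print(salaries_scalar.tolist())  # [294, 392, 497, 245, 441, 539]
print(np.array_equal(salaries_full, salaries_scalar))  # True
```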
Plotting Employee Productivity
    def show_data(self):
        x = np.arange(len(self.int_array))
        plt.plot(x, self.int_array, marker='o')
        plt.title("Productivity of employees")
        plt.xlabel("rank of employees")
        plt.ylabel("products/month")
        plt.xticks(x)
        plt.grid()
        plt.show()
This draws a line plot showing each employee’s productivity visually.
Breakdown:
- x-axis: each employee's position in the row (0, 1, 2…), labeled "rank of employees" in the plot
- y-axis: number of products produced
- grid and markers help readability
This turns raw numbers into something you can visually interpret.
main()
def main():
    data_frame = file_handling()
    print(data_frame)
    first_branch = IntArray(data_frame[0])
    first_branch.display()
    first_branch.salary()
    first_branch.show_data()
    max_productivity(data_frame)
    min_productivity(data_frame)

# Run the pipeline only when the file is executed as a script
if __name__ == "__main__":
    main()
Here’s what happens:
- Data is loaded from the file.
- The first company’s data is wrapped inside an IntArray object.
For that first company, the program then:
- prints the productivity numbers
- calculates salaries
- visualizes employee productivity
Finally, the program identifies:
- the company with the highest total production
- the company with the lowest total production
This is a full data analysis pipeline wrapped in a single script.
Conclusion
This project pulls together several core Python skills:
- file reading and parsing
- using NumPy arrays for efficient numerical operations
- implementing object-oriented design
- performing analysis with aggregation functions
- creating data visualizations
- working with real-world style datasets
It’s a great example of how separate Python concepts come together in a practical, understandable way. If you can follow and rebuild this script, you’re well on your way to handling more complex data-driven Python applications.
To check the implementation, clone this repo.
How to Analyze Company Productivity in Python Using NumPy, OOP, and File Handling was originally published in Javarevisited on Medium, where people are continuing the conversation by highlighting and responding to this story.