Using Python for really fast file operations

Using the os module will help you speed up your workflow.

The Python os module can be used for tasks such as finding the name of the present working directory, changing the current working directory, checking whether certain files or directories exist at a location, creating new directories, deleting existing files or directories, and walking through a directory to perform operations on every file that satisfies some user-defined criteria, among others.
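To make the list above concrete, here is a minimal sketch that touches each of those tasks once. It works in a throwaway temporary directory, and the folder and file names ("rainfall", "a.tif", "notes.txt") are made up for illustration only.

```python
import os
import shutil
import tempfile

# Work in a throwaway directory (an assumption for this sketch) so no real data is touched.
base = tempfile.mkdtemp()
start_dir = os.getcwd()                 # name of the present working directory

os.chdir(base)                          # change the current working directory
os.mkdir("rainfall")                    # create a new directory (relative to base)
made_dir = os.path.isdir("rainfall")    # check that the directory exists

# Create two placeholder files with hypothetical names.
for name in ("a.tif", "notes.txt"):
    with open(os.path.join("rainfall", name), "w") as handle:
        handle.write("placeholder")

os.remove(os.path.join("rainfall", "notes.txt"))    # delete an existing file
notes_left = os.path.exists(os.path.join("rainfall", "notes.txt"))

# Walk the directory tree and keep only files that meet a criterion.
tif_files = []
for root, dirs, files in os.walk(base):
    for name in files:
        if name.endswith(".tif"):
            tif_files.append(os.path.join(root, name))

print(made_dir, notes_left, tif_files)

os.chdir(start_dir)                     # return to where we started
shutil.rmtree(base)                     # clean up the throwaway directory
```

The criterion here is simply "the name ends with .tif"; any other test on the file name could take its place.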

Step 1: Download the chirps-v2.0 data for the year 2000 from here.

Step 2: Extract all data into the same folder.

Step 3: Create script8a.py with the code below and run it.

import os

# define directory
directory = "/path/to/chirps/"

# list files in folder
listfiles = os.listdir(directory)

# print file list
print(listfiles)
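Two details of os.listdir are worth knowing: it returns bare file names rather than full paths, and it does not promise any particular order. The sketch below shows one way to deal with both. It uses a temporary directory with two made-up file names as a stand-in for the real data folder.

```python
import os
import tempfile

# Throwaway directory with hypothetical file names, standing in for /path/to/chirps/.
directory = tempfile.mkdtemp()
for name in ("chirps-v2.0.2000.02.tif", "chirps-v2.0.2000.01.tif"):
    open(os.path.join(directory, name), "w").close()

# sorted() gives a predictable order; os.listdir() makes no ordering promise.
names = sorted(os.listdir(directory))

# Join each bare name with the directory to get a path that can be opened.
paths = [os.path.join(directory, name) for name in names]
for path in paths:
    print(path)
```

The joined paths are what you would pass on to anything that needs to open the files, rather than the bare names.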

Question 8.1: What data structure does the listfiles variable have?

Assignment 8.1: Print the names of the files line by line. Use the function below.

def printlistitems(mylist):
    # print every item of the list on its own line
    for item in mylist:
        print(item)

A file name can be split into parts on a single separator character, such as the dot.

Step 4: Create script8b.py with the code below and run it.

import os
from osgeo import gdal
import matplotlib.pyplot as plt

# define directory
directory = "/path/to/chirps/"

# list files in folder
listfiles = os.listdir(directory)

# split every file name on the dot and print the resulting parts
for filename in listfiles:
    print(filename.split("."))
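To see what the split produces without having the data at hand, here is a small sketch on a single name. The file name is an assumed example; it follows the pattern implied by the indices used in this tutorial (year at position 2, month at position 3).

```python
# Hypothetical file name, assumed to follow the chirps-v2.0.YYYY.MM.tif pattern.
filename = "chirps-v2.0.2000.01.tif"

parts = filename.split(".")
print(parts)  # ['chirps-v2', '0', '2000', '01', 'tif']

# The dot inside "v2.0" is also a separator, so the year lands at index 2
# and the month at index 3.
year = parts[2]
month = parts[3]
print(year, month)
```

Note that every part is still a string; "01" has to be converted with int() before it can be used as a number.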

These parts can be used to extract important information, such as the year and the month.

import os
from osgeo import gdal
import matplotlib.pyplot as plt

# define directory
directory = "/path/to/chirps/"

# list files in folder
listfiles = os.listdir(directory)

for filename in listfiles:
    splitfile = filename.split(".")
    # splitfile[3] is the month, splitfile[2] is the year
    print("{} contains the precipitation data of {} {}".format(filename, splitfile[3], splitfile[2]))

Create the script below to print the name of the month.

import os
from osgeo import gdal
import calendar
import matplotlib.pyplot as plt

# define directory
directory = "/path/to/chirps/"

# list files in folder
listfiles = os.listdir(directory)

for filename in listfiles:
    splitfile = filename.split(".")
    # calendar.month_name maps a month number (1-12) to its name
    print("{} contains the precipitation data of {} {}".format(filename, calendar.month_name[int(splitfile[3])], splitfile[2]))
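The loop above assumes that every file in the folder follows the same naming pattern. If the folder also holds other files, indexing into the split name can fail. The sketch below is one possible way to guard against that; the helper name describe and the example file names are hypothetical.

```python
import calendar


def describe(filename):
    """Return a description for a CHIRPS-style name, or None if it does not match."""
    parts = filename.split(".")
    # Expect at least: 'chirps-v2', '0', year, month, extension
    if len(parts) < 5 or not parts[3].isdigit():
        return None
    month = int(parts[3])
    if not 1 <= month <= 12:
        return None
    return "{} contains the precipitation data of {} {}".format(
        filename, calendar.month_name[month], parts[2])


# Hypothetical names: one that matches the pattern and one that does not.
for name in ["chirps-v2.0.2000.01.tif", "readme.txt"]:
    print(describe(name))
```

Returning None for names that do not fit lets the calling loop decide whether to skip them or report them.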

Many options are available, such as writing the information to a text file. Note that "\n" stands for a newline.

import os
from osgeo import gdal
import calendar
import matplotlib.pyplot as plt

# define directory
directory = "/path/to/chirps/"

# list files in folder
listfiles = os.listdir(directory)

# open the text file; the with block closes it automatically when done
with open("/path/to/newfile.txt", "w") as myfile:
    for filename in listfiles:
        splitfile = filename.split(".")
        myfile.write(filename + " contains the precipitation data of " + calendar.month_name[int(splitfile[3])] + " " + splitfile[2] + "\n")
