How to clean up JFrog Artifactory artifacts using a Python script
In this article we will see how to clean up a JFrog Artifactory repository by removing artifacts that have not been modified for a defined period of time. The period can be set according to your organization's retention policy. We will use Python along with some popular modules such as requests and json to achieve this task.
Test Environment
Fedora 32 installed
Python3 installed
Every project that follows a DevOps CI/CD pipeline uses some tools for source code management and artifact management. Because a CI/CD pipeline builds, tests and releases code for production in a short time, many builds and test runs happen before the actual artifact is ready for production deployment. Each of these builds is pushed to an Artifactory repository, so they accumulate over time. At some stage the organization needs to clean up its Artifactory repository to get rid of old, unused builds or tags and free up space on the Artifactory server.
So, let's get started and see how we can clean up JFrog Artifactory artifacts using a Python script.
Procedure
Step1: Import the required libraries
Here we are going to use the below five libraries for our artifact cleanup task. The ‘requests’ module is used to send HTTP/HTTPS requests to JFrog Artifactory to fetch details about the repositories and artifacts. Once the required details are fetched, we parse the JSON response using the ‘json’ module to pick out the required fields. The ‘datetime’ module is used to capture the current date and to compare it with the last modified date of each artifact, which gives us the condition for deletion. The ‘dateutil.parser’ module (from the python-dateutil package) parses the last modified timestamp returned in the JSON response, and ‘sys’ is used to exit the script if the search request fails.
import requests
import json
import datetime
import sys
import dateutil.parser
Step2: Set the Environment variables
Here we are setting the below variables, which hold the Artifactory url, the header we want to pass with the request, and the AQL query that we want to send as the request body (data). We are also setting the username and password, which will be passed as auth data to the HTTP request, along with the current_date and days_older variables used to prepare our cleanup condition.
url = 'https://jfrog_server_fqdn/artifactory'
headers = {"content-type": "text/plain"}
data = 'items.find({"repo": "repository_name", "name": "manifest.json"})'
username = 'username'
password = 'password'
auth=(username, password)
#current_time = datetime.datetime.now()
current_date = datetime.date.today()
#current_date = datetime.datetime.utcnow().isoformat()
days_older = 730
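Hard-coding the password in the script is fine for a quick test, but it may be safer to read it from the operating system environment. The following is a minimal sketch of that idea, not part of the original script. The names ARTIFACTORY_URL, ARTIFACTORY_USER, ARTIFACTORY_PASSWORD and ARTIFACTORY_REPO, as well as the two helper functions, are hypothetical and chosen only for illustration.

```python
import json
import os


def build_aql_query(repo, name="manifest.json"):
    """Build the items.find(...) AQL string used as the POST body."""
    # json.dumps takes care of quoting the repository and file names.
    return "items.find({})".format(json.dumps({"repo": repo, "name": name}))


def load_settings(env=os.environ):
    """Read connection settings from environment variables.

    The variable names are hypothetical; pick whatever suits your setup.
    """
    url = env.get("ARTIFACTORY_URL", "https://jfrog_server_fqdn/artifactory")
    return {
        "url": url.rstrip("/"),
        # A missing user or password raises KeyError instead of silently
        # falling back to a default credential.
        "auth": (env["ARTIFACTORY_USER"], env["ARTIFACTORY_PASSWORD"]),
        "data": build_aql_query(env.get("ARTIFACTORY_REPO", "repository_name")),
    }
```

With this approach the values for url, auth and data would come from load_settings() instead of being written into the script.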
Step3: Function to list repositories and tags
In this function we are passing the Artifactory url, headers, AQL query data and authentication data, which are sent as part of an HTTP POST request. Please note the REST API ‘/api/search/aql’ that we append to the Artifactory url. This is the search API for the Artifactory Query Language (AQL), and we use it to search for artifacts that contain a manifest.json file, based on the find condition in the request data.
Once we have the response to the AQL search query, we parse it as JSON using the ‘json’ module and fetch the results field, which contains one entry per matching artifact with all its details. From these entries we fetch the repository name, path and last modified fields to use for the artifact cleanup.
We also prepare a new url named artifact_url, which consists of the Artifactory url with the repository name and path appended, to form the complete url of the respective artifact tag.
def list_repo_tags(url, headers, data, auth):
    # Search the repository using the AQL search API
    response = requests.post(url+'/api/search/aql', headers=headers, data=data, auth=auth)
    #print(response.status_code)
    # Treat any non-2xx status as a failure
    if response.status_code >= 300:
        print("Unable to search artifacts in artifactory repository. Exiting")
        sys.exit(1)
    #print(response.text)
    #print(response.json())
    json_data = json.loads(response.text)
    tags = json_data["results"]
    for eachItem in tags:
        repo = eachItem['repo']
        path = eachItem['path']
        last_modified_date = eachItem['modified']
        #print('{}, {}, {}'.format(repo, path, last_modified_date))
        # Complete url of the folder (tag) that holds manifest.json
        artifact_url = url+"/"+repo+"/"+path
        #print(artifact_url)
        delete_repo_tags(artifact_url, last_modified_date, current_date, days_older, auth)
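To make the parsing step easier to follow without a live server, here is a small sketch that runs the same extraction against a hand-written sample. The sample JSON, the repository name docker-local, the paths and the helper extract_tags are assumptions made up for illustration; a real AQL response contains more fields per item.

```python
import json

# Hand-written sample shaped like an AQL search response (not real output).
sample_response = """
{
  "results": [
    {"repo": "docker-local", "path": "myapp/1.0.0", "name": "manifest.json",
     "modified": "2019-03-14T09:26:53.000Z"},
    {"repo": "docker-local", "path": "myapp/1.1.0", "name": "manifest.json",
     "modified": "2021-11-02T17:05:10.000Z"}
  ]
}
"""


def extract_tags(base_url, response_text):
    """Return (artifact_url, modified) pairs for every item in the response."""
    results = json.loads(response_text).get("results", [])
    return [
        ("{}/{}/{}".format(base_url.rstrip("/"), item["repo"], item["path"]),
         item["modified"])
        for item in results
    ]


tags = extract_tags("https://jfrog_server_fqdn/artifactory", sample_response)
for artifact_url, modified in tags:
    print(artifact_url, modified)
```

Keeping the extraction separate from the HTTP call like this also makes it possible to test the url building logic on its own.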
Step4: Delete the artifacts older than days_older value
In this function we use the prepared artifact_url. We convert the last modified timestamp into a date (formated_date) and then normalize both current_date and formated_date to the “%Y-%m-%d” format using the datetime module. Once both dates are in this format, we take the difference between them to find the number of days between the two. If this number of days is greater than the days_older value, we consider that artifact for deletion using requests.delete as shown below.
I have commented out the delete request, as it would delete the artifacts from the repository. Please make sure that you test this code in your test environment before using it in a production environment.
def delete_repo_tags(artifact_url, last_modified_date, current_date, days_older, auth):
    formated_date = dateutil.parser.isoparse(last_modified_date).date()
    #print(artifact_url)
    #print(formated_date)
    #print(current_date)
    date_format = "%Y-%m-%d"
    x = datetime.datetime.strptime(str(formated_date), date_format)
    y = datetime.datetime.strptime(str(current_date), date_format)
    num_of_days = (y - x).days
    #print(num_of_days)
    if num_of_days > days_older:
        print(artifact_url)
        print(formated_date)
        print(current_date)
        print(num_of_days)
        # requests.delete(artifact_url, auth=auth)
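The age check can also be tried in isolation with fixed dates, which helps to verify the threshold before any delete request is enabled. The sketch below uses only the standard library and assumes the timestamp starts with a YYYY-MM-DD date, so only the first ten characters are parsed. The helper names age_in_days and is_expired and the sample timestamp are hypothetical.

```python
import datetime


def age_in_days(last_modified, today):
    """Whole days between an ISO 8601 timestamp and the given date."""
    # Assumes the timestamp starts with YYYY-MM-DD; only the date part is used.
    modified_date = datetime.date.fromisoformat(last_modified[:10])
    return (today - modified_date).days


def is_expired(last_modified, today, days_older):
    """True when the artifact was last modified more than days_older days ago."""
    return age_in_days(last_modified, today) > days_older


# A fixed "today" keeps the example reproducible.
today = datetime.date(2022, 1, 1)
print(age_in_days("2019-03-14T09:26:53.000Z", today))
print(is_expired("2019-03-14T09:26:53.000Z", today, 730))
```

Subtracting two date objects directly gives the same day count as the strptime round trip in the function above, because both values are already plain dates.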
Step5: Calling the list_repo_tags function
Now that we have all our variables and function definitions ready, let's call our list_repo_tags function, which is where the execution starts.
list_repo_tags(url, headers, data, auth)
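Since deletion is destructive, one option is to make the script report by default and delete only when asked. The sketch below shows how this could be wired up with argparse. The option names --days-older and --delete are hypothetical and not part of the original script.

```python
import argparse


def parse_args(argv=None):
    """Parse hypothetical command-line options for the cleanup script."""
    parser = argparse.ArgumentParser(description="Clean up old Artifactory tags")
    parser.add_argument("--days-older", type=int, default=730,
                        help="report tags not modified for more than this many days")
    parser.add_argument("--delete", action="store_true",
                        help="actually send DELETE requests (default: only report)")
    return parser.parse_args(argv)


# Passing an explicit list keeps the example independent of sys.argv.
args = parse_args(["--days-older", "365"])
print(args.days_older, args.delete)
```

The value of args.days_older could then replace the hard-coded days_older, and the requests.delete call could be placed behind a check on args.delete.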
Step6: Complete Code
Here is the complete code for your reference.
[admin@fedser32 rsk-docker]$ cat clean_up_test.py
#!/usr/bin/env python
import requests
import json
import datetime
import sys
import dateutil.parser

### Environment variables
url = 'https://jfrog_server_fqdn/artifactory'
headers = {"content-type": "text/plain"}
data = 'items.find({"repo": "repository_name", "name": "manifest.json"})'
username = 'username'
password = 'password'
auth=(username, password)
#current_time = datetime.datetime.now()
current_date = datetime.date.today()
#current_date = datetime.datetime.utcnow().isoformat()
days_older = 730
#print('"current_time : {}"'.format(current_time))
#print('"current_date : {}"'.format(current_date))

### Delete repository tags older than days_older
def delete_repo_tags(artifact_url, last_modified_date, current_date, days_older, auth):
    formated_date = dateutil.parser.isoparse(last_modified_date).date()
    #print(artifact_url)
    #print(formated_date)
    #print(current_date)
    date_format = "%Y-%m-%d"
    x = datetime.datetime.strptime(str(formated_date), date_format)
    y = datetime.datetime.strptime(str(current_date), date_format)
    num_of_days = (y - x).days
    #print(num_of_days)
    if num_of_days > days_older:
        print(artifact_url)
        print(formated_date)
        print(current_date)
        print(num_of_days)
        # print(requests.delete(artifact_url, auth=auth))

### List repository tags
def list_repo_tags(url, headers, data, auth):
    response = requests.post(url+'/api/search/aql', headers=headers, data=data, auth=auth)
    # Treat any non-2xx status as a failure
    if response.status_code >= 300:
        print("Unable to search artifacts in artifactory repository. Exiting")
        sys.exit(1)
    #print(response.text)
    #print(response.json())
    json_data = json.loads(response.text)
    tags = json_data["results"]
    for eachItem in tags:
        repo = eachItem['repo']
        path = eachItem['path']
        last_modified_date = eachItem['modified']
        #print('{}, {}, {}'.format(repo, path, last_modified_date))
        # Complete url of the folder (tag) that holds manifest.json
        artifact_url = url+"/"+repo+"/"+path
        #print(artifact_url)
        delete_repo_tags(artifact_url, last_modified_date, current_date, days_older, auth)

list_repo_tags(url, headers, data, auth)
Hope you enjoyed reading this article. Thank you.