- 20th Jul 2024
- 21:16 pm
- Admin
Write a Python program crawlStars.py to find the URLs of all images in the Astronomy Picture of the Day Archive (APOD) that have "Star Cluster" in their title. Your program has to save all such images that were taken between 1 January 2020 and 1 May 2022. The following screenshot shows the information of three pictures on the APOD webpage. The date of an image can be extracted either from the explicit text, e.g., "2022 April 05", or from the string of the provided hypertext reference attribute (href), e.g., "ap220405.html" (see the highlighted image as an example). The title of the highlighted image is "Seven Sisters versus California".
- 2022 April 06: <a href="ap220406.html">Earendel: A Star in the Early Universe</a><br>
- 2022 April 05: <a href="ap220405.html">Seven Sisters versus California</a><br>
- 2022 April 04: <a href="ap220404.html">A Vortex Aurora over Iceland</a><br>
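The filtering step described above can be sketched with a regular expression over the archive HTML. This is only a minimal sketch, not the assignment's reference solution: the names `ENTRY_RE` and `find_star_clusters` are hypothetical, the entry layout is assumed to match the three lines shown in the screenshot, and both ends of the date range are assumed to be inclusive.

```python
import re
from datetime import date

# Assumed entry layout: <a href="ap220405.html">Seven Sisters versus California</a>
ENTRY_RE = re.compile(
    r'<a\s+href\s*=\s*"\s*ap(\d{2})(\d{2})(\d{2})\.html\s*"\s*>(.*?)</a>',
    re.IGNORECASE | re.DOTALL,
)

START = date(2020, 1, 1)
END = date(2022, 5, 1)


def find_star_clusters(archive_html, start=START, end=END):
    """Return (yymmdd, href, title) tuples for entries matching title and date."""
    matches = []
    for yy, mm, dd, title in ENTRY_RE.findall(archive_html):
        # APOD started in 1995, so two-digit years 95-99 mean 19xx, the rest 20xx.
        year = 1900 + int(yy) if int(yy) >= 95 else 2000 + int(yy)
        taken = date(year, int(mm), int(dd))
        title = " ".join(title.split())  # collapse line breaks inside a title
        # Assumption: the range is inclusive and the title match is case-sensitive.
        if start <= taken <= end and "Star Cluster" in title:
            code = yy + mm + dd
            matches.append((code, "ap" + code + ".html", title))
    return matches
```

Taking the date from the href rather than from the explicit text avoids parsing month names, and the same six digits are reused later for the file name.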
To generate the URL of an image for a specific day, you need to concatenate the string referring to the parent directory, "https://apod.nasa.gov/apod/", with the string of the href, which contains the relative address of each day's page. For instance, in the highlighted example above, you can access the Seven Sisters versus California page via https://apod.nasa.gov/apod/ap220405.html. Finally, your image file can be retrieved by finding the "IMG SRC" attribute in the HTML. See the example below for the Seven Sisters versus
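As a rough illustration of the URL handling, the sketch below joins the parent directory with an href and pulls the first "IMG SRC" value out of a day's page. The function names are hypothetical, and the regular expression assumes the attribute is written as `<IMG SRC="...">` with a double-quoted relative path; pages that embed a video instead of an image would return `None`.

```python
import re
from urllib.parse import urljoin
from urllib.request import urlopen

BASE_URL = "https://apod.nasa.gov/apod/"

# Assumption: the image is referenced as <IMG SRC="relative/path.jpg" ...>
IMG_SRC_RE = re.compile(r'<img\s+src\s*=\s*"([^"]+)"', re.IGNORECASE)


def page_url(href):
    """Build the absolute URL of a day's page from its relative href."""
    return urljoin(BASE_URL, href.strip())


def image_url_from_html(page_html):
    """Return the absolute URL of the first IMG SRC, or None if there is none."""
    match = IMG_SRC_RE.search(page_html)
    if match is None:
        return None
    return urljoin(BASE_URL, match.group(1))


def fetch_image_url(href):
    """Download a day's page and return the absolute URL of its image."""
    with urlopen(page_url(href)) as response:
        page_html = response.read().decode("utf-8", errors="replace")
    return image_url_from_html(page_html)
```

Calling `page_url("ap220405.html")` yields the page address quoted above; `fetch_image_url` needs network access, so it is best tried on a single page first.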
Your program will need to save these images in the same folder as crawlStars.py. The name of each image file should be "image_<date>.jpg", where the date is a six-digit number in yymmdd format. For the above example, the name of the saved file would be image_220405.jpg. You are only allowed to use Python built-in functions and modules, as well as the re and urllib libraries, for this task.
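The naming rule can be sketched as follows. The helpers `image_filename` and `save_image` are hypothetical names, the six digits are assumed to come from an href of the form "apYYMMDD.html", and `urlretrieve` is just one of the ways `urllib` offers to write a download to disk.

```python
import os
import re
from urllib.request import urlretrieve


def image_filename(href):
    """Turn an href such as 'ap220405.html' into 'image_220405.jpg'."""
    match = re.search(r"ap(\d{6})\.html", href)
    if match is None:
        raise ValueError("unexpected href: " + repr(href))
    return "image_" + match.group(1) + ".jpg"


def save_image(image_url, href, folder=None):
    """Download image_url into folder (default: the folder of this script)."""
    if folder is None:
        # Assumption: this code lives in crawlStars.py, so __file__ is its path.
        folder = os.path.dirname(os.path.abspath(__file__))
    target = os.path.join(folder, image_filename(href))
    urlretrieve(image_url, target)
    return target
```

For the highlighted example, `image_filename("ap220405.html")` gives "image_220405.jpg", matching the name required above.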
Free Assignment Solution - Crawl and Download "Star Cluster" Images Using Python
import numpy as np
from PIL import Image


def clSt(img_file, txt_file, p, K):
    '''
    This function takes the name of the image file, the
    text file formed in 3.1.2, the p value (the point provided by the user) and
    the K value as parameters.
    It first collects all the points (as a list) present inside the text file, then
    sorts the list by the distance of each star to the given point
    and takes the first K values to print the result (closest K points).
    '''
    im = Image.open(img_file).convert('RGB')
    im = np.array(im)
    try:
        # Check that p lies inside the image
        im[p[0], p[1]]
    except IndexError:
        # Fall back to the origin when p is outside the image
        p = (0, 0)

    with open(txt_file, 'r') as f:
        points = []
        for i, line in enumerate(f):
            if i == 0:
                N = int(line)  # first line: number of points (not used below)
            else:
                s = line.split(',')
                points.append([int(s[0]), int(s[1])])

    # Sort by squared distance to p (same ordering as the true distance)
    points.sort(key=lambda x: (x[0] - p[0])**2 + (x[1] - p[1])**2)
    result = points[:K]

    for i, x in enumerate(result):
        d = round(np.sqrt((x[0] - p[0])**2 + (x[1] - p[1])**2), 2)
        print(i + 1, '- closest star is located at (' + str(x[0]) + ', ' +
              str(x[1]) + ')', 'with a distance of', d)
About The Author - Alex Johnson
Alex Johnson is an experienced software developer with a strong focus on web scraping and data extraction using Python. Specializing in Python's built-in libraries, such as re and urllib, Alex has developed efficient scripts to retrieve and process data from various online sources. His recent work includes designing a Python program to crawl the Astronomy Picture of the Day Archive, extracting and saving images with specific titles, and applying date filters to manage and organize data effectively.