Final submission commit
Some checks failed
ci/woodpecker/push/lint Pipeline failed
ci/woodpecker/push/test Pipeline failed

A few changes:
- Added report
- Added mid project demo content
- Added README.txt
- Added Experiments
- Added ratings.txt file
- Added unit tests
- Simplified the prediction code
- Added single prediction
Oscar Blue 2022-05-14 17:44:05 +01:00
parent 05edb5598a
commit 523f518aa8
117 changed files with 278814 additions and 178 deletions


@ -1,12 +0,0 @@
# Autophotographer
Autophotographer is a tool that helps users filter the best images from a video.
```
.
├── docs Report and Project proposal
├── img images for the repository
├── README.md
├── src project source code
└── terraform terraform IaC
```

78
README.txt Normal file

@ -0,0 +1,78 @@
Autophotographer is a tool that helps users filter the best aesthetic quality images from a video.
.
├── data/ Generated data
├── docker-compose.yaml Docker-compose file for quick configuration of mount points
├── Dockerfile.amd-gpu Dockerfile for AMD GPUs
├── Dockerfile.cpu Dockerfile for CPU
├── Dockerfile.nvidia-gpu Dockerfile for Nvidia GPUs
├── docs/ Report and project proposal documents
├── environment.yml Anaconda environment dependencies
├── .git Git configuration
├── .gitattributes Git LFS tracking file
├── .gitignore Git ignore file
├── img/ Repository images
├── pyproject.toml Project's package build system requirements and information
├── README.txt Information about the project's structure
├── requirements_dev.txt Python package requirements for testing
├── requirements.txt Python package requirements
├── setup.cfg Python package configuration file
├── setup.py Python package configuration file
├── src/ Project source code
├── terraform/ Terraform IaC
├── test/ Tests
└── .woodpecker/ Pipeline declarations for WoodpeckerCI
data/
└── ratings.txt Single-value calculated ratings for AVA dataset
terraform/
├── google Terraform IaC for Google Cloud compute
└── linode Terraform IaC for the Linode cloud platform
test/
├── integration/ Integration tests
├── pytest.ini Pytest configuration file
└── unit/ Unit tests
.woodpecker/
├── .lint.yml Linting pipeline
└── .test.yml Testing pipeline
src/
├── autophotographer/ Main project source code
├── experiments/ Experiments to test the project
├── __init__.py
├── output CNN model output and training graphs
src/autophotographer/
├── autophotographer_main.py Code for main pipeline
├── cnn/ CNN code
├── config.yml Configuration file for autophotographer_main.py
├── filters/ Source code for filters (only some were extracted here)
├── __init__.py
└── predict.py Predict and graph a random batch of AVA images
src/autophotographer/cnn/
├── config.py Config python file holding variables like batch size and validation split
├── dataframe.csv Exported dataframe of AVA image path locations, ratings and IDs
├── dataset.py Functions for working with the AVA dataset
├── export/ Exported tensors (might be removed to meet the file size requirements of submission)
├── __init__.py
├── model.py CNN training script
├── predict.py Use to predict a rating given an image and a model
└── __pycache__
To use the Docker container environment (which includes all dependencies):
In the project root, run:
"docker-compose build" to build the Docker image
"docker-compose run autophotographer" to start the container and open a shell inside it
Inside the container, run "pip install -e ." to install the package.
You can then run "python /mmp-osp1/src/autophotographer/autophotographer_main.py -i path_to_input_video_or_images" to run the pipeline.
If using Docker, check the docker-compose file to configure the project's mount points. Mount points were used instead of copying the source into the container at build time in order to reduce image sizes and speed up development.
Downloading the AVA dataset:
The dataset used in the project was downloaded from here: https://mega.nz/folder/9b520Lzb#2gIa1fgAzr677dcHKxjmtQ
The link was featured on a GitHub page (https://github.com/imfing/ava_downloader), which also includes a script to scrape the images and generate the dataset yourself, as well as a torrent link.

255530
data/ratings.txt Normal file

File diff suppressed because it is too large


@ -0,0 +1 @@
<mxfile host="app.diagrams.net" modified="2022-03-22T14:31:32.334Z" agent="5.0 (X11)" etag="Fg5dJ9IUfPYWRN-JqWZQ" version="16.6.4" type="device"><diagram id="Vcx6PXM0HU997HkytNiH" name="Page-1">7VpZc6M4EP41fhyK08dj7CS7U5VJspPamc3TlAwy1kQgSojYnl+/LRBgWSR2NnbwVg2plOnWgdRfqy8YeLNk/QdH2fILizAduHa0HniXA9d17WAMP5KzqTij8ahixJxEFctpGQ/kF1ZMW3ELEuFc6ygYo4JkOjNkaYpDofEQ52yld1swqj81QzE2GA8hoib3O4nEsuKOA7vl/4lJvKyf7NiqJUF1Z8XIlyhiqy2WdzXwZpwxUd0l6xmmUni1XKpx1y+0NgvjOBWHDLi/ffzOH//+uoqL0eYv7/56TaafPLU2sak3jCPYvyIZF0sWsxTRq5Y75axIIyxntYFq+9wwlgHTAeZPLMRGgYkKwYC1FAlVrbBgvvlHjreCmnxU05XE5VqjNoqq1ioX+KIIFCtnBQ/xK/uuVQnxGItX+rkNUKDhmCUY1gPjOKZIkGd9HUipWtz0a9GAGwXIG8BR8z4jWqgnfcV5ikWtfFuo6ZislkTghwyVIljBwdTlr6bFXOD165I0d14PmKg1qGPteYpetYfEqTV/uXVAhvaJhOUawhq4QwpPnUbkGW5jeXsNc9s3aIP5wLuo2+FxW106Rn1Os0LAQDjb9sGD7gqhRtWNc77bfXeSMwPVd/oGNejDPB3RzPgHmpkXcPkYM+MbJ+dzIr3iuamjG/StjkNDUjd3Dw/lnf0FoxR+LuY5o4UA6dlXEH5wQ4ogD6GLKhecPeEZo9Dbu0xZKrV4QSjdYSFK4hTIEOQnDdhUSpdAqHKhGhISReUR6MJGR+8I8Hi78Iw64Bl1wOOdCp6xAc9dJkgC5xxgsCuYLiKUHIDJ4bLmOCe/0LycSko2YyQV5c6C6SC4lHOBhckrY/MWuI8Ake/ueGm7AyKnAyL3VBB54z4sehNwfrIt2/G1mNMa7Yk6S+oecwISkKpw7FDUdQ90El7Qp5eol7l1umYsleopmZCksTI1kzkWRwmkbO9yH02yJPvmT1iEy+NZrsYqbWp/EfTsWFzTBxvyi0GA2Yu7V/mvMkSD5gS/wZ7rQgnGHUKxO4TSMI+f/5gqVyqalWS+IZ79CpXJlmQdy0KFNUc5Ca2IhUVSykezwAPXWwTyT/JZKrb41WWYcWgZlte27jodfmPOhGDJoXr8irKYOPYIlJmpGvhohxjlWVWsWZC1NAmV18T86hlXztOwARHKl43x2Bc8UTTH9B6criCs04vf7HRoUNmFS0gPY0YDO8pil1eHliV5iLAV4xRzRC1lJY+Ne1Nrs2T1qbkcXzvPTmCZBzqYWKNxh507mRPpJaFr602652+KTx9QbzrUybu9Vpy8nuuBveHjTg4NwnrFp16mVhKMilCmmzmWkViRkzSWFEkyKtkZZyHOK+5pIrJtYxjhBSpoW816V5wWOOcWp3lmRPIVpU8l/gkiqZK9hgSC/9nt7f9O+sOzi5I9s/5iSPXkUfJumDzskIrfFX35JxNLsF8sZxod//dgqNKFswqCvd8wGJm2fli63id8MEgHJNpnClJEOJiuKl+BeEYKcLqgJPt2ahB3iohu/yD+Nnj7qjddr2M/GCTTWZeFQcf6mZmh6JnCdYLqzR4cfR1Hv3cc6+N+cG0HTBMlKci2/iDHNnGAv2u5hinAGRHctnW8DtHqK13VHq00dKwKz5byybWr3BXSeEWrzb+lmKVpMFrlvgXhqICsAfMfHMcENrf5QcpXrzsB/8m0rf46qwlmDW0b+qay1bzjv4l2DNU64/JDLhAXF/Ibs1YXS941kdtWI6K6R0gRpOFhxVRdyseviag+hppMhoquFuApqn2+JDZbxA
lfS9X2e3/FatKtYu+tiADZfhxXtm19Yuhd/Qs=</diagram></mxfile>

File diff suppressed because one or more lines are too long


303
docs/report/.gitignore vendored Normal file

@ -0,0 +1,303 @@
## Core latex/pdflatex auxiliary files:
*.aux
*.lof
*.log
*.lot
*.fls
*.out
*.toc
*.fmt
*.fot
*.cb
*.cb2
.*.lb
## Intermediate documents:
*.dvi
*.xdv
*-converted-to.*
# these rules might exclude image files for figures etc.
# *.ps
# *.eps
# *.pdf
## Generated if empty string is given at "Please type another file name for output:"
.pdf
## Bibliography auxiliary files (bibtex/biblatex/biber):
*.bbl
*.bcf
*.blg
*-blx.aux
*-blx.bib
*.run.xml
## Build tool auxiliary files:
*.fdb_latexmk
*.synctex
*.synctex(busy)
*.synctex.gz
*.synctex.gz(busy)
*.pdfsync
## Build tool directories for auxiliary files
# latexrun
latex.out/
## Auxiliary and intermediate files from other packages:
# algorithms
*.alg
*.loa
# achemso
acs-*.bib
# amsthm
*.thm
# beamer
*.nav
*.pre
*.snm
*.vrb
# changes
*.soc
# comment
*.cut
# cprotect
*.cpt
# elsarticle (documentclass of Elsevier journals)
*.spl
# endnotes
*.ent
# fixme
*.lox
# feynmf/feynmp
*.mf
*.mp
*.t[1-9]
*.t[1-9][0-9]
*.tfm
#(r)(e)ledmac/(r)(e)ledpar
*.end
*.?end
*.[1-9]
*.[1-9][0-9]
*.[1-9][0-9][0-9]
*.[1-9]R
*.[1-9][0-9]R
*.[1-9][0-9][0-9]R
*.eledsec[1-9]
*.eledsec[1-9]R
*.eledsec[1-9][0-9]
*.eledsec[1-9][0-9]R
*.eledsec[1-9][0-9][0-9]
*.eledsec[1-9][0-9][0-9]R
# glossaries
*.acn
*.acr
*.glg
*.glo
*.gls
*.glsdefs
*.lzo
*.lzs
*.slg
*.slo
*.sls
# uncomment this for glossaries-extra (will ignore makeindex's style files!)
# *.ist
# gnuplot
*.gnuplot
*.table
# gnuplottex
*-gnuplottex-*
# gregoriotex
*.gaux
*.glog
*.gtex
# htlatex
*.4ct
*.4tc
*.idv
*.lg
*.trc
*.xref
# hyperref
*.brf
# knitr
*-concordance.tex
# TODO Uncomment the next line if you use knitr and want to ignore its generated tikz files
# *.tikz
*-tikzDictionary
# listings
*.lol
# luatexja-ruby
*.ltjruby
# makeidx
*.idx
*.ilg
*.ind
# minitoc
*.maf
*.mlf
*.mlt
*.mtc[0-9]*
*.slf[0-9]*
*.slt[0-9]*
*.stc[0-9]*
# minted
_minted*
*.pyg
# morewrites
*.mw
# newpax
*.newpax
# nomencl
*.nlg
*.nlo
*.nls
# pax
*.pax
# pdfpcnotes
*.pdfpc
# sagetex
*.sagetex.sage
*.sagetex.py
*.sagetex.scmd
# scrwfile
*.wrt
# svg
svg-inkscape/
# sympy
*.sout
*.sympy
sympy-plots-for-*.tex/
# pdfcomment
*.upa
*.upb
# pythontex
*.pytxcode
pythontex-files-*/
# tcolorbox
*.listing
# thmtools
*.loe
# TikZ & PGF
*.dpth
*.md5
*.auxlock
# titletoc
*.ptc
# todonotes
*.tdo
# vhistory
*.hst
*.ver
# easy-todo
*.lod
# xcolor
*.xcp
# xmpincl
*.xmpi
# xindy
*.xdy
# xypic precompiled matrices and outlines
*.xyc
*.xyd
# endfloat
*.ttt
*.fff
# Latexian
TSWLatexianTemp*
## Editors:
# WinEdt
*.bak
*.sav
# Texpad
.texpadtmp
# LyX
*.lyx~
# Kile
*.backup
# gummi
.*.swp
# KBibTeX
*~[0-9]*
# TeXnicCenter
*.tps
# auto folder when using emacs and auctex
./auto/*
*.el
# expex forward references with \gathertags
*-tags.tex
# standalone packages
*.sta
# Makeindex log files
*.lpz
# xwatermark package
*.xwm
# REVTeX puts footnotes in the bibliography by default, unless the nofootinbib
# option is specified. Footnotes are the stored in a file with suffix Notes.bib.
# Uncomment the next line to have this generated file ignored.
#*Notes.bib
.sync_*


@ -0,0 +1,34 @@
\chapter{Third-Party Code and Libraries}
%
%If you have made use of any third party code or software libraries, i.e. any code that you have not designed and written yourself, then you must include this appendix.
%
%As has been said in lectures, it is acceptable and likely that you will make use of third-party code and software libraries. If third party code or libraries are used, your work will build on that to produce notable new work. The key requirement is that we understand what your original work is and what work is based on that of other people.
%
%Therefore, you need to clearly state what you have used and where the original material can be found. Also, if you have made any changes to the original versions, you must explain what you have changed.
%
%The following is an example of what you might say.
%
%Apache POI library - The project has been used to read and write Microsoft Excel files (XLS) as part of the interaction with the client's existing system for processing data. Version 3.10-FINAL was used. The library is open source and it is available from the Apache Software Foundation
%\cite{apache_poi}. The library is released using the Apache License
%\cite{apache_license}. This library was used without modification.
%
%Include as many declarations as appropriate for your work. The specific wording is less important than the fact that you are declaring the relevant work.
The following third party code and libraries were used to support this project\footnote{This list does not include the libraries pulled down as dependencies of entries in this list.}:
\begin{itemize}
\item \textbf{opencv-python} - provides many useful functions for working with images and video
\item \textbf{numpy} - provides multi-dimensional arrays and mathematical functions which can be applied to said arrays
\item \textbf{matplotlib} - provides functions for plotting and exporting graphs
\item \textbf{pandas} - provides advanced table data structures and mathematical functions to apply to them
\item \textbf{PyYAML} - provides functions to read and write YAML files
\item \textbf{scikit-image} - provides many functions for analysing images
\item \textbf{tqdm} - provides progress bars
\item \textbf{flake8} - (used in testing) provides linting, helping find inconsistencies with style
\item \textbf{pytest} - (used in testing) a python-based testing framework
\item \textbf{pytest-cov} - (used in testing) a plugin for pytest which will provide a report on test coverage
\item \textbf{unittest} - (used in testing) the mock module of unittest was used to create mocks in testing
\item \textbf{mypy} - (used in testing) provides linting for type checking
\item \textbf{torch} - machine learning framework that provides tools for building and training models
\item \textbf{torchvision} - library that contains datasets and models for neural networks
\end{itemize}


@ -0,0 +1,90 @@
\chapter{Code Examples}
%For some projects, it might be relevant to include some code extracts in an appendix. You are not expected to put all of your code here - the correct place for all of your code is in the technical submission that is made in addition to the Project Report. However, if there are some notable aspects of the code that you discuss, including that in an appendix might be useful to make it easier for your readers to access.
%
%As a general guide, if you are discussing short extracts of code then you are advised to include such code in the body of the report. If there is a longer extract that is relevant, then you might include it as shown in the following section.
%
%Only include code in the appendix if that code is discussed and referred to in the body of the report.
\section{Woodpecker pipelines}
\subsection{Lint pipeline}
The lint pipeline was used to automatically report any linting issues with the code after a commit.
\begin{verbatim}
pipeline:
  flake8:
    image: python:3.8
    commands:
      - python3.8 -m pip install flake8
      - flake8 src/
  mypy:
    image: python:3.8
    commands:
      - python3.8 -m pip install mypy
      - mypy src/
  isort:
    image: python:3.8
    commands:
      - python3.8 -m pip install isort
      - isort --diff src/
branches: dev
\end{verbatim}
\subsection{Test pipeline}
\begin{verbatim}
pipeline:
  unit-tests:
    image: python:3.8
    commands:
      - apt update -y && apt install libgl1 -y
      - python3.8 -m pip install -r ./requirements.txt
      - python3.8 -m pip install -r ./requirements_dev.txt
      - python3.8 -m pip install -e .
      - pytest test/ -v
branches: dev
\end{verbatim}
\subsection{Configuration file}\label{config file}
\begin{verbatim}
---
# Configuration file for the autophotographer tool
# List of filters to apply in order
# Note: Possible filters include: brightness, filesize, contrast, focus
filters:
  - brightness
  - filesize
  - contrast
  - focus
# Whether or not to apply CNN ranking
CNNrank: False
# Ignore video files and don't bother processing them into frames
# Note: Useful if directory contains original video and individual
# frames from video (prevents processing the same frames more than once)
ignore_video: True
# Options for brightness filter
brightness_options:
  threshold: 0.25
# Options for filesize filter
filesize_options:
  threshold: 0.35
# Options for contrast filter
contrast_options:
  threshold: 0.35
# Options for focus filter
focus_options:
  threshold: 0.5
...
\end{verbatim}
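A minimal sketch of how a configuration of this shape might drive the filter order is given below. It is illustrative only and is not taken from the project source: the dictionary mirrors what a YAML parser would return for the file above, and the names \texttt{FILTERS}, \texttt{make\_threshold\_filter}, \texttt{apply\_filters} and the pre-computed per-image scores are hypothetical. It also assumes that a filter keeps an image when its score is at or above the threshold.

```python
from typing import Callable, Dict, List

# Parsed form of the configuration above (the shape a YAML parser would return).
CONFIG = {
    "filters": ["brightness", "filesize", "contrast", "focus"],
    "CNNrank": False,
    "ignore_video": True,
    "brightness_options": {"threshold": 0.25},
    "filesize_options": {"threshold": 0.35},
    "contrast_options": {"threshold": 0.35},
    "focus_options": {"threshold": 0.5},
}

# Hypothetical image record: a name plus pre-computed, normalised scores in [0, 1].
Image = Dict[str, object]


def make_threshold_filter(metric: str) -> Callable[[List[Image], float], List[Image]]:
    """Build a filter that keeps images whose score for `metric` meets the threshold."""
    def run(images: List[Image], threshold: float) -> List[Image]:
        return [img for img in images if img["scores"][metric] >= threshold]
    return run


# Hypothetical registry mapping the filter names used in the config to callables.
FILTERS = {name: make_threshold_filter(name)
           for name in ("brightness", "filesize", "contrast", "focus")}


def apply_filters(images: List[Image], config: dict) -> List[Image]:
    """Apply the configured filters in the order they are listed."""
    for name in config["filters"]:
        if name not in FILTERS:
            raise ValueError(f"unknown filter: {name}")
        threshold = config[f"{name}_options"]["threshold"]
        images = FILTERS[name](images, threshold)
    return images


frames = [
    {"name": "frame_001.jpg",
     "scores": {"brightness": 0.60, "filesize": 0.50, "contrast": 0.40, "focus": 0.70}},
    {"name": "frame_002.jpg",  # too dark for the brightness threshold
     "scores": {"brightness": 0.10, "filesize": 0.50, "contrast": 0.40, "focus": 0.70}},
    {"name": "frame_003.jpg",  # too soft for the focus threshold
     "scores": {"brightness": 0.60, "filesize": 0.50, "contrast": 0.40, "focus": 0.30}},
]
kept = apply_filters(frames, CONFIG)
print([img["name"] for img in kept])
```

Because the order comes from the \texttt{filters} list, reordering or removing entries in the configuration changes the pipeline without touching the code.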
\subsection{pil\_loader() method}
This method was taken from a GitHub issue comment: \href{https://github.com/python-pillow/Pillow/issues/835\#issuecomment-53999355}{https://github.com/python-pillow/Pillow/issues/835\#issuecomment-53999355}.
\begin{verbatim}
from PIL import Image

def pil_loader(path):
    # Open via a file handle so the file is closed once the image is converted
    with open(path, 'rb') as f:
        image = Image.open(f)
        return image.convert('RGB')
\end{verbatim}


@ -0,0 +1,169 @@
\chapter{Background \& Objectives}
%This section should discuss your preparation for the project, including background reading, your analysis of the problem and the process or method you have followed to help structure your work. It is likely that you will reuse part of your outline project specification, but at the end of the project you should have more to discuss.
%
%\textbf{Note}:
%
%\begin{itemize}
% \item All of the sections and text in this example are for illustration purposes. The main Chapters are a good starting point, but the content and actual sections that you include are likely to be different.
%
% \item Look at the document MMP\_S08 Project Report and Technical Work \cite{ProjectReportTechicalWork} for additional guidance.
%
%\end {itemize}
\section{Background}
%What was your background preparation for the project? What similar systems or research techniques did you assess? What was your motivation and interest in this project?
\subsection{Context}
In a world where most people have a camera in their pocket, the number of pictures taken every minute has risen sharply over the last few decades. Smartphone cameras have reached the point where they can rival their fully-fledged counterparts. The necessity\footnote{Both \textit{Google} and \textit{Apple} accounts are required to use their respective app stores and \textit{Apple} doesn't support alternative app stores.} of online accounts on \textit{Android} and \textit{iOS}, which both include cloud storage (\textit{Google Drive / Google Photos} and \textit{iCloud} respectively), has made automatic cloud backup for photographs the norm. Together, these aspects have lowered the entry requirements and have enabled an entire generation to produce large quantities of high-resolution photographs, something previously limited to professional and hobbyist photographers.
The upsurge in data produced and people's reliance on free cloud storage have started to create storage concerns for providers. To combat this, certain providers have taken measures to reduce the load by reducing or introducing service limits\cite{google_photos}\cite{google_workspace}. In turn, this has led people to reconsider how they manage their data. People have had to return to the tedious, manual practice of filtering through their old photographs in order to free up space for new content.
The significance of photo selection isn't limited to storage issues. The rise of social media and influencer culture has culminated in entire businesses that require carefully selected photographs. \textit{Instagram}, a photo-sharing social media platform, has become the most prominent hub for influencer-based marketing. Its popularity has spawned many Instagram curation apps\cite{instagram_planner}\cite{unum}, which help users and businesses plan out their posts to work towards a cohesive profile aesthetic\cite{instagram_aesthetic}. This has also caught the attention of data scientists who have investigated what makes certain images perform better online than others\cite{intrinsic_image_popularity}.
Image aesthetic quality assessment is the process of automatically assessing the visual aesthetic rating of an image. This area of research has improved over the last five years due to advancements in AI, computer vision, and machine learning. On the AVA dataset\footnote{The AVA (Aesthetic Visual Analysis) dataset is the most common dataset used for aesthetic quality assessment.} alone, accuracy rates of up to 83\% have been reported when predicting the aesthetic quality rating of images (see Figure \ref{fig:aqa-on-ava}).
\begin{figure}[H]
\centering
\includegraphics[width=\textwidth]{aqa-on-ava}
\caption{Accuracy of models trained on the AVA dataset\cite{aqa-on-ava}}
\label{fig:aqa-on-ava}
\end{figure}
\subsection{Related Work}
\paragraph{Trophy Camera}
In 2017, fine artist Dries Depoorter and professional photographer Max Pinckers developed a camera ``that can only make award winning pictures''\cite{trophy_camera}. This camera was built using a Raspberry Pi and was programmed to only save award-winning photographs, which were subsequently uploaded to a website: \href{https://trophy.camera}{trophy.camera}\cite{trophy_camera_site}. The AI was trained using previous winning photographs from the WPPY (World Press Photos of the Year) contest (\textit{Warning: Website includes images some might find disturbing, including death, extreme violence, and suffering})\cite{wpp}. By comparing the photos featured on \href{https://trophy.camera}{\textit{trophy.camera}} and the previous winning photographs of WPPY, you can see the effectiveness of the project is questionable. This isn't surprising, given the stark difference in subject matter and the photography skills of the general public versus the award-winning photographers. The majority of the photographs submitted to the WPPY are taken by professional photographers using high-end cameras (See figure \ref{fig:trophy camera train photo}) and mostly depict global conflict. In contrast, the Trophy Camera is limited to the walls of a gallery, using a low-end camera, and the photos are taken by the general public (See figure \ref{fig:trophy camera photo}). This is an example where the data used for training doesn't match the desired use-case of the project, resulting in images that fail to achieve an ``award winning'' appearance.
\begin{figure}[H]
\begin{subfigure}[b]{0.3\textwidth}
\centering
\includegraphics[width=\textwidth]{trophy-camera}
\caption{Trophy Camera\cite{trophy_camera}}
\label{fig:trophy camera}
\end{subfigure}
\hfill
\begin{subfigure}[b]{0.3\textwidth}
\centering
\includegraphics[width=\textwidth]{trophy-camera-image}
\caption{Photo taken with Trophy Camera\cite{trophy_camera_site}}
\label{fig:trophy camera photo}
\end{subfigure}
\hfill
\begin{subfigure}[b]{0.3\textwidth}
\centering
\includegraphics[width=\textwidth]{wpp}
\caption{Example of photo used to train Trophy Camera\cite{wpp-spencer}}
\label{fig:trophy camera train photo}
\end{subfigure}
\caption{Trophy Camera}
\label{fig:trophy camera photos}
\end{figure}
\paragraph{Archillect} A combination of the words ``archive'' and ``intellect'', Archillect\cite{archillect} is an AI that automatically curates visually stimulating content found online. To do this, Archillect uses an algorithm and a list of keywords to search for posts and pages, then crawls pages linked from the original results to find new images, gain relevant contextual knowledge, and learn new keywords. Ultimately, Archillect learns which images create visual stimulation for people on social media, reposting them and updating the algorithm according to ``likes'' and other user engagement metrics.
\begin{figure}[H]
\centering
\includegraphics[width=\textwidth]{archillect}
\caption{\href{https://archillect.com}{archillect.com} homepage\cite{archillect}}
\label{fig:archillect}
\end{figure}
\subsection{Motivation}
Historically, the most common and successful approaches to grading the aesthetic quality of images have been through machine learning. This involves creating a neural network, typically a CNN (Convolutional Neural Network), and training it with professional photography based datasets. The most common of these datasets is AVA (Aesthetic Visual Analysis), which comprises a set of professionally taken photographs and an aesthetic rating provided by numerous professional photographers that acts as a ground truth for aesthetic quality.
Many people believe aesthetics are subjective. We can observe this in how the aesthetics of certain art pieces are often contested and controversial. Despite the perceived subjective nature of aesthetics, there seem to be certain ideas that are almost universally accepted as aesthetic. In art we often hear about composition rules like the golden ratio, rule of thirds, and symmetry, which are meant to evoke a positive sentiment (see Figure \ref{fig:common photographic techniques}). For others, aesthetics can't be reduced to a set of predefined rules, and many people welcome images that break the conventions of professional photography.
\begin{figure}[H]
\centering
\begin{subfigure}[b]{0.45\textwidth}
\centering
\includegraphics[width=\textwidth]{golden-ratio}
\caption{Golden Ratio\cite{golden_ratio}}
\label{fig:golden ratio}
\end{subfigure}
\hfill
\begin{subfigure}[b]{0.45\textwidth}
\centering
\includegraphics[width=\textwidth]{symmetry}
\caption{Symmetry\cite{symmetry}}
\label{fig:symmetry}
\end{subfigure}
\begin{subfigure}[b]{0.45\textwidth}
\centering
\includegraphics[width=\textwidth]{rule-of-thirds}
\caption{Rule of Thirds\cite{rule-of-thirds}}
\label{fig:rule of thirds}
\end{subfigure}
\hfill
\begin{subfigure}[b]{0.45\textwidth}
\centering
\includegraphics[width=\textwidth]{dof}
\caption{Shallow Depth of Field\cite{dof}}
\label{fig:y equals x}
\end{subfigure}
\caption{Common photographic techniques}
\label{fig:common photographic techniques}
\end{figure}
\subsubsection{Applications}
Automatic image selection is a useful concept that can be used in many applications. The main motivation for the project was to use the system as an accessibility tool, allowing those who are less technically or physically able to take images of good aesthetic quality. They could wear a head-mounted camera to record a journey or outing, and the tool could automatically reduce the footage to an album of good-looking photos for them to enjoy.
Professional photographers could use automatic image selection to remove technically poor images from a large collection of images from a photoshoot. This process would help alleviate much of the manual filtering photographers have to do after a photoshoot.
\subsubsection{Personal motivations}
An important motivation for this project and its direction is my interest in machine learning. The CS36220 Machine Learning\cite{ml-module} module acted as a good introduction to the topic, but I wanted to get some personal experience with deep learning as it's an area of computer science I find really interesting.
\section{Analysis}
%Taking into account the problem and what you learned from the background work, what was your analysis of the problem? How did your analysis help to decompose the problem into the main tasks that you would undertake? Were there alternative approaches? Why did you choose one approach compared to the alternatives?
%
%There should be a clear statement of the research questions, which you will evaluate at the end of the work.
%
%In most cases, the agreed objectives or requirements will be the result of a compromise between what would ideally have been produced and what was felt to be possible in the time available. A discussion of the process of arriving at the final list is usually appropriate.
%
% When processing video files, each frame should be the same resolution
\subsection{Problem Description}
This project aims to explore the question: ``Can we use a computer to make accurate aesthetic judgements on images?''
In order to create a program to detect and select aesthetic images from a video file, the video file needs to be broken down into individual frames, then each frame needs to be analysed against certain aesthetic features and rules (such as rule of thirds, contrast, brightness, and focus). Each of these filters will need to be implemented in such a way that allows any order and combination of filters to be used. Lastly, a machine learning approach will be taken to rate and rank the remaining images. For this we can use a CNN (Convolutional Neural Network), which will need to be trained on professional photographs as they can be considered a good baseline for ``aesthetic'' quality.
\subsection{Approach}
The specific approach taken for this project is to use a combination of machine learning and conventional image processing techniques. This is a good balance between performance and accuracy, as the conventional image processing techniques can filter out the technically poor images (blurry, low contrast, extreme exposure etc.), thereby limiting the amount of processing required by the machine learning model at the end of the pipeline.
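The saving from this ordering can be illustrated with a small sketch. It is not the project's implementation: \texttt{rank\_frames}, \texttt{is\_sharp} and \texttt{fake\_cnn\_score} are hypothetical stand-ins for the pipeline, a conventional focus filter, and CNN inference respectively, and the focus threshold of 0.5 is an assumed value. The counter shows that the expensive model is only invoked for frames that survive the cheap filter.

```python
from typing import Callable, List, Tuple


def rank_frames(
    frames: List[dict],
    cheap_filter: Callable[[dict], bool],
    expensive_score: Callable[[dict], float],
) -> List[Tuple[float, dict]]:
    """Discard frames with the cheap filter, then score only the survivors."""
    survivors = [frame for frame in frames if cheap_filter(frame)]
    scored = [(expensive_score(frame), frame) for frame in survivors]
    # Highest-rated frames first.
    return sorted(scored, key=lambda pair: pair[0], reverse=True)


calls = {"model": 0}


def is_sharp(frame: dict) -> bool:
    # Stand-in for a conventional focus check with an assumed threshold of 0.5.
    return frame["focus"] >= 0.5


def fake_cnn_score(frame: dict) -> float:
    # Stand-in for CNN inference; counts how many times the "model" is called.
    calls["model"] += 1
    return frame["rating"]


frames = [
    {"id": 1, "focus": 0.9, "rating": 5.1},
    {"id": 2, "focus": 0.2, "rating": 6.3},  # blurry: never reaches the model
    {"id": 3, "focus": 0.7, "rating": 5.8},
    {"id": 4, "focus": 0.1, "rating": 4.0},  # blurry: never reaches the model
]
ranked = rank_frames(frames, is_sharp, fake_cnn_score)
print([frame["id"] for _, frame in ranked], calls["model"])
```

Here four frames enter the pipeline but the scoring function runs only twice, which is the effect the combined approach relies on.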
\subsection{Alternative Approaches}
This project could also be approached from an exclusively conventional computer vision angle, which would require writing specific code to recognise aesthetic photographic techniques (like rule of thirds and vanishing point: see figure \ref{fig:common photographic techniques}) and using those to help determine the aesthetic quality of an image.
A machine-learning-only approach could also be taken, but in the context of this project this would be computationally wasteful. Due to the nature of a continuous video recording, many frames will be blurry or underexposed due to lighting changes. These images are technically poor in quality, and processing hundreds of them using a CNN would be wasteful when they could be more easily discarded using conventional image processing techniques.
\subsection{Aim}
The aim of the project is to develop a piece of software that will take a video or a set of images as an input, process them through filtering, and output a subset of ``aesthetic'' images or frames. The set will be processed through filters which will be implemented as part of the project.
\subsection{Objectives}
\begin{itemize}
\item Context specific research into machine learning
\begin{itemize}
\item Research suitable machine learning frameworks
\item Research how to start building a CNN
\end{itemize}
\item Set up tools for development
\begin{itemize}
\item Host a Gitea instance to host the project's repository
\item Create repository for project
\item Mirror repository to GitLab.com for availability
\item Set up WoodpeckerCI for CI/CD workloads and connect it to the project repository
\item Set up pipelines for automated testing in WoodpeckerCI
\end{itemize}
\item Research conventional image processing techniques
\begin{itemize}
\item Quantify brightness
\item Quantify contrast
\item Quantify focus
\item Depth of field detection
\item Vanishing point detection
\end{itemize}
\end{itemize}
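As an illustration of what ``quantifying'' brightness, contrast, and focus can mean in practice, the sketch below uses three common measures: mean intensity, RMS contrast (the standard deviation of intensities), and the variance of the Laplacian. These are assumed choices for the example rather than the measures used by the project's filters, and the function names are hypothetical. Images are represented as plain nested lists of grayscale values so that the example has no dependencies; the focus measure needs an image of at least 3 by 3 pixels.

```python
from statistics import mean, pstdev, pvariance
from typing import List

Gray = List[List[float]]  # rows of grayscale intensities in the range 0..255


def brightness(image: Gray) -> float:
    """Mean intensity scaled to 0..1."""
    return mean(p for row in image for p in row) / 255.0


def contrast(image: Gray) -> float:
    """RMS contrast: standard deviation of intensities scaled to 0..1."""
    return pstdev(p for row in image for p in row) / 255.0


def focus(image: Gray) -> float:
    """Variance of the 4-neighbour Laplacian over interior pixels.

    Sharp edges give large Laplacian responses, so a low variance suggests blur.
    """
    responses = []
    for y in range(1, len(image) - 1):
        for x in range(1, len(image[0]) - 1):
            responses.append(
                image[y - 1][x] + image[y + 1][x]
                + image[y][x - 1] + image[y][x + 1]
                - 4 * image[y][x]
            )
    return pvariance(responses)


# A featureless grey patch versus a high-frequency checkerboard.
flat = [[128] * 4 for _ in range(4)]
checker = [[255 if (x + y) % 2 else 0 for x in range(4)] for y in range(4)]
for name, img in (("flat", flat), ("checker", checker)):
    print(name, round(brightness(img), 3), round(contrast(img), 3), focus(img))
```

The flat patch scores zero for both contrast and focus, while the checkerboard scores highly on both, which is the kind of separation a threshold-based filter depends on.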
\subsubsection{Research Questions}
\begin{itemize}
\item How effective is the machine learning approach compared to a more conventional one?
\item Which order of filtering is most effective?
\item Can a CNN be used to make aesthetic judgements of an image?
\end{itemize}
%You need to describe briefly the life cycle model or research method that you used. You do not need to write about all of the different process models that you are aware of. Focus on the process model or research method that you have used. It is possible that you needed to adapt an existing method to suit your project; clearly identify what you used and how you adapted it for your needs.
%
%For the research-oriented projects, there needs to be a suitable process for the construction of the software elements that support your work.


@ -0,0 +1,55 @@
\chapter{Evaluation} % 1818 WORDS
%
%Examiners expect to find a section addressing questions such as:
%
%\begin{itemize}
% \item Were the requirements correctly identified?
% \item Were the design decisions correct?
% \item Could a more suitable set of tools have been chosen?
% \item How well did the software meet the needs of those who were expecting to use it?
% \item How well were any other project aims achieved?
% \item If you were starting again, what would you do differently?
%\end{itemize}
%
%Other questions can be addressed as appropriate for a project.
%
%Such material is regarded as an important part of your work; it should demonstrate that you are capable not only of carrying out a piece of work but also of thinking critically about how you did it and how you might have done it better. This is seen as an important part of an honours degree.
%
%There will be good things and room for improvement with any project. As you write this section, identify and discuss the parts of the work that went well and also consider ways in which the work could be improved.
%
%In the latter stages of the module, we will discuss the evaluation. That will probably be around week 9, although that differs each year.
\section{Design Decisions}
\section{Development Process}
Due to the lack of a strict development process, tasks weren't evaluated and took more effort than expected. This led to the project's scope being pulled back over the course of development. A stricter development process would have enabled the project to grow at a steadier pace, and evaluating the work and time taken for each work item at the end of each week would have helped with prioritising tasks and adapting the scope.
\section{Approach}
The machine learning approach was chosen due to the attractiveness of the possible output rather than its context as part of a major project. Going down the machine learning path required a lot of learning and getting a good grasp of complex topics, which took up a large portion of the project. In the end, more time was spent learning about CNNs than expected, which led to less time for development.
\section{Project State}
The project is considered unfinished as there was much more planned to implement. The filters are basic in concept and in implementation, and require manually determining a ``good'' threshold and applying it. If the work were to be continued, the pipeline would process each frame of a video and use the distribution of values from a given filter to determine lower and upper bound outliers, which would then be removed automatically.
\subsection{Implementation}
The overall implementation of the project is incomplete. Most of the code was hard-coded and static, which made it difficult to work with. With more time the project would be restructured and refactored to make the code easier to work with and develop further.
\subsection{Testing}
One of the biggest issues with the project is a lack of testing. Due to many of the bad decisions in the approach to the project and the lack of a strict development process, testing was left until the very end.
\subsection{Experiments}
Many of the experiments only give superficial insight into the level of success of the project rather than providing meaningful metrics.
\section{Report}
The overall quality of the report is quite poor and many of its sections are limited in detail and scope. One of the difficulties that was presented in writing this report was a lack of experience in research-based approaches and report writing. This made it difficult to give the project a research-based context. Much of the detailed design is missing from the report and there's a lack of detail when talking about specific implementation issues.
\section{Time management}
The time management for the project was very poor and it was a struggle to finish everything that was initially planned. This was mostly due to a lack of detailed planning and of a strict development process, which led to ambiguity about how much work was required to finish the project; this only became clearer towards the deadline. Near the end of the project, it was difficult to balance project work with other areas of life and personal issues.
\section{Further Work}
The technical work achieved in this project is a good foundation for further work. Alterations can be made to the CNN to attempt to improve its accuracy in later iterations. One way that this can be done is to explore fine-tuning the CNN, potentially unfreezing all the layers and training the model again.
If composition filters were implemented (golden ratio, rule of thirds, symmetry etc.) they could be used to boost the detected aesthetic features. If rule of thirds is loosely detected in an image, the option could be added to automatically crop the image in a non-destructive way\footnote{Saving the result as a new image, rather than overwriting the original} so that the rule of thirds has a stronger presence in the image. This could also be applied to filters like brightness and contrast but might be less successful due to the sensitivity in artificially enhancing them.
A good next step would be to refactor and restructure the code to allow plug-in filters, making filter development easier but also allowing other developers to write their own that can be dropped into the pipeline.
\section{Personal conclusion}
Overall this project has been more of a learning experience and an opportunity to work with new technology for me than a well planned experiment. I feel like I have gained a lot while also not having much to show for it.


@ -0,0 +1,13 @@
%\addcontentsline{toc}{chapter}{Development Process}
\chapter{Experiment Methods}
%
%This section should discuss the overall hypothesis being tested and justify the approach selected in the context of the research area. Describe the experiment design that has been selected and how measurements and comparisons of results are to be made.
%
%You should concentrate on the more important aspects of the method. Present an overview before going into detail. As well as describing the methods adopted, discuss other approaches that were considered. You might also discuss areas that you had to revise after some investigation.
%
%You should also identify any support tools that you used. You should discuss your choice of implementation tools or simulation tools. For any code that you have written, you can talk about languages and related tools. For any simulation and analysis tools, identify the tools and how they are used on the project.
%
%For the parts of your project that need some engineering (hardware, software, firmware, or a mixture) to support the experiments, include details in your report about your design and implementation. You should discuss with your supervisor whether it is better to include a different top-level section to describe any engineering work. In this template, Chapter 3 is suggested as a place for that discussion.
\section{Hypothesis}
Hypothesis: Can a CNN and a set of image processors make accurate aesthetic judgements on images?


@ -0,0 +1,128 @@
\chapter{Results and Conclusions}
%
%This section should discuss issues you encountered as you tried to implement your experiments. What were the results of running the experiments? What conclusions can you draw from these results? What graphs or other information have you assessed regarding your experiments? Discuss those.
%
%During the work, you might have found that elements of your experiments were unnecessary or overly complex; perhaps third-party libraries were available that simplified some of the functions that you intended to implement. If things were easier in some areas, then how did you adapt your project to take account of your findings?
%
%It is more likely that things were more complex than you first thought. In particular, were there any problems or difficulties that you found during implementation that you had to address? Did such problems simply delay you or were they more significant?
%
%If you had multiple experiments to run, it may be sensible to discuss each experiment in separate sections.
\section{CNN}
\subsection{Training}
The first training attempt for the CNN was only run for 20 epochs on a CPU to check for runtime errors. During these 20 epochs the loss did not plateau, suggesting that the model could benefit from longer training. The next attempt was run for 2000 epochs on a GPU. Figure \ref{fig: 2000 epochs} shows that the decrease in training loss started to slow at around 30 epochs and plateaued at around 60. While the validation loss also plateaued at around 60 epochs, its variance was much greater. This most likely means the model has converged on a concept of aesthetic quality that doesn't match the concept it was set out to learn. As shown in figure \ref{fig: terminal-loss}, the resulting training loss was 0.576810 and the validation loss was 0.604089. These values are in the units of the loss function rather than a percentage, so they indicate an average loss of roughly 0.6 against the ground truth, with quite a high variance.
\begin{figure}[H]
\begin{subfigure}[b]{0.48\textwidth}
\centering
\includegraphics[width=\textwidth]{plot-20-epochs}
\caption{20 epochs}
\label{fig: 20 epochs}
\end{subfigure}
\hfill
\begin{subfigure}[b]{0.48\textwidth}
\centering
\includegraphics[width=\textwidth]{plot-2000-epochs}
\caption{2000 epochs}
\label{fig: 2000 epochs}
\end{subfigure}
\caption{Training loss for model}
\label{fig:training loss}
\end{figure}
\begin{figure}[H]
\centering
\includegraphics[width=\textwidth]{terminal-loss}
\caption{Terminal output when training}
\label{fig: terminal-loss}
\end{figure}
\subsection{Testing}
\subsubsection{Experiment 1}
Due to the lack of tuning in the pipeline, experiment 1 is non-functional.
The aim of experiment 1 was to pass a batch of AVA images and their ratings through the pipeline and compare the average rating of the batch before and after filtering. If the average rating has increased, it tells us that the pipeline is doing some form of effective filtering. It's not a well thought out experiment, so the results are difficult to quantify.
\begin{figure}[H]
\centering
\includegraphics[width=\textwidth]{experiments/experiment-1}
\caption{Experiment 1}
\label{fig: experiment 1}
\end{figure}
\subsubsection{Experiment 2}
Due to the lack of tuning in the pipeline, experiment 2 is non-functional.
The aim of this experiment was to see which filter could reduce the set by the largest amount. The same set of images is passed through each filter once and the remaining images are plotted.
\begin{figure}[H]
\centering
\includegraphics[width=\textwidth]{experiments/experiment-2}
\caption{Experiment 2}
\label{fig: experiment 2}
\end{figure}
\subsubsection{Experiment 3}
During the third experiment, there was an attempt to visualize how the model performed on real data. The model was used to predict the aesthetic quality rating of the frames in the videos provided by Dr Hannah Dee. The ratings were then ranked and the 5 highest-ranking and 5 lowest-ranking entries were plotted. This experiment isn't an accurate metric of how well the model performs, but it is useful to see how it performed on data that wasn't from the AVA dataset. Common across all of the following examples is that the top 5 images lie in the low 6s while the bottom 5 images lie in the high 5s. There doesn't seem to be as much variance as expected from the data, which includes a few shots in very poor lighting conditions and some very blurry frames. This might be an issue with the architecture or training of the CNN, resulting in most images being predicted in the range of 5 to 6.
\begin{figure}[H]
\centering
\includegraphics[width=\textwidth]{experiments/exp-3-overcastjuly-melindwr1}
\caption{overcastjuly-melindwr1}
\label{fig: overcastjuly-melindwr1}
\end{figure}
\begin{figure}[H]
\centering
\includegraphics[width=\textwidth]{experiments/exp-3-overcastjuly-melindwr2}
\caption{overcastjuly-melindwr2}
\label{fig: overcastjuly-melindwr2}
\end{figure}
\begin{figure}[H]
\centering
\includegraphics[width=\textwidth]{experiments/exp-3-sunnyaugust-camels_hump}
\caption{sunnyaugust-camels\_hump}
\label{fig:sunnyaugust-camels_hump}
\end{figure}
\begin{figure}[H]
\centering
\includegraphics[width=\textwidth]{experiments/exp-3-sunnyaugust-diggers_end}
\caption{sunnyaugust-diggers\_end}
\label{fig: sunnyaugust-diggers_end}
\end{figure}
\begin{figure}[H]
\centering
\includegraphics[width=\textwidth]{experiments/exp-3-sunnyaugust-drunken_druid}
\caption{sunnyaugust-drunken\_druid}
\label{fig: sunnyaugust-drunken_druid}
\end{figure}
\begin{figure}[H]
\centering
\includegraphics[width=\textwidth]{experiments/exp-3-sunnyaugust-hippety_hop}
\caption{sunnyaugust-hippety\_hop}
\label{fig: sunnyaugust-hippety_hop}
\end{figure}
\begin{figure}[H]
\centering
\includegraphics[width=\textwidth]{experiments/exp-3-sunnyaugust-melindwr1}
\caption{sunnyaugust-melindwr1}
\label{fig: sunnyaugust-melindwr1}
\end{figure}
\begin{figure}[H]
\centering
\includegraphics[width=\textwidth]{experiments/exp-3-sunnyaugust-melindwr2}
\caption{sunnyaugust-melindwr2}
\label{fig: sunnyaugust-melindwr2}
\end{figure}
\begin{figure}[H]
\centering
\includegraphics[width=\textwidth]{experiments/exp-3-sunnyaugust-spaghetti_junction}
\caption{sunnyaugust-spaghetti\_junction}
\label{fig: sunnyaugust-spaghetti_junction}
\end{figure}
\subsubsection{Experiment 4}
\begin{figure}[H]
\centering
\includegraphics[width=\textwidth]{experiments/experiment-4.png}
\caption{Artificial modifications and their ratings}
\label{fig: experiment 4}
\end{figure}
In the fourth experiment, artificial data was used to test the model's performance. A base image\cite{unsplash} was artificially edited with GIMP\cite{gimp} to lower its aesthetic quality. The first image in figure \ref{fig: experiment 4} is the original photo, the second has had its contrast lowered, the third has had its exposure increased, and the fourth has had a Gaussian blur applied to it. From the results it's clear that the model has no understanding of the decline in quality in the edited pictures. In this example all of the edited pictures actually score higher than the original picture.
\section{Filters}
\subsection{Filesize}
\subsection{Contrast}
\subsection{Brightness}
\subsection{Focus}


@ -0,0 +1,283 @@
\chapter{Software Design, Implementation and Testing}
\section{Development Process}
Originally, the plan was for the project to follow a scrum-like methodology, but due to the research-based nature of the project and the use of new technologies, it was difficult to estimate the time and effort required for certain tasks. Naturally, the weekly meetings with the project supervisor acted as a weekly review of the work done and of the work to be done the following week. When the project repositories were set up, a Kanban board was set up alongside them. This was used to keep track of the progress of tasks and features. Milestones were created to group together small tasks that lead to a bigger task being completed.
%blog
% kanban
\section{Languages and Frameworks}
\subsection{Python}
Python\cite{python} was selected as the programming language for the project due to its popularity in data science and its dynamically typed nature. This enables an easier transition from project ideas to working code. It was also chosen for its use in machine learning, as it seems to be the main language of most of the top machine learning libraries.
\subsection{PyTorch}
The most common frameworks for deep learning are \textit{TensorFlow}\cite{tensorflow}, \textit{PyTorch}\cite{pytorch}, and \textit{Keras}\cite{keras}.
\textit{TensorFlow}, arguably the best known of the three, is developed by Google. It's mostly used in Python but can also be used with C++, Java, and JavaScript. It provides a powerful framework for ML (Machine Learning) and AI (Artificial Intelligence) tasks, although it primarily focuses on creating and training deep neural networks.
\textit{Keras} is a high-level deep-learning API used to interface with TensorFlow. Its main aim is to simplify the process of developing deep neural networks for use in quick experimentation.
The last of the three, \textit{PyTorch}, is primarily developed by Facebook and acts as the main competitor to TensorFlow. It has gained popularity within the research community due to its ease of use and gradual learning curve compared to TensorFlow. %why?
The decision was made to use PyTorch as machine learning isn't the primary aspect of this project. The reduced complexity should speed up development time.
%This could be one chapter or a few chapters. It should define and discuss the software that is developed to support the research that is being conducted. For example, if your research involves running experiments, what software are you creating to support that work? What functionality is required? What design will be used? What implementation issues are there and what testing is used?
%
%Even though a research project is investigating specific research questions, it is still necessary for you to discuss the software that you develop. Research has a habit of generating bits of software that can exist for several years and need future modification. Therefore you need to be able to discuss the technical issues as well as the research approach.
%
\section{Software Tools and Technologies}
\subsection{VS Code}
The IDE/Editor used for this project is VS Code\footnote{More specifically: VS Codium, a libre-binary version of Microsoft's VS Code. It's built from VS Code's source, albeit with tracking and telemetry disabled, and then distributed as binaries. It's essentially VS Code, so it will continue to be referred to as such.}. This was chosen as it was a familiar and popular IDE/Editor with good plugin support and features.
\subsection{Docker}
Docker containers were used to host tools like Gitea and Woodpecker on a private server to aid with development. They were also used to create reproducible development and testing environments\footnote{Dockerfiles and a docker-compose file are supplied with the source code for this project.}.
\subsection{Gitea}
For version control this project uses Git, with Gitea as an open-source and self-hostable Git host. Gitea also includes workflow features like bug tracking via ``issues'', kanban boards (see figure \ref{fig:kanban}), and plugins to work with other tools (see section \ref{woodpecker}). The source code for this project is hosted on a personal instance of Gitea at \href{https://git.oscar.blue}{git.oscar.blue}. The project's repository is also mirrored\footnote{Sync interval is every 8 hours.} to \href{https://gitlab.com}{gitlab.com} for availability.
\begin{figure}[H]
\begin{subfigure}[b]{0.4\linewidth}
\centering
\includegraphics[width=\textwidth]{gitea}
\caption{Original Gitea version}
\label{fig:gitea}
\end{subfigure}
\hfill
\begin{subfigure}[b]{0.4\linewidth}
\centering
\includegraphics[width=\textwidth]{gitlab}
\caption{Mirror GitLab version}
\label{fig:gitlab}
\end{subfigure}
\end{figure}
\begin{figure}[H]
\centering
\includegraphics[width=\textwidth]{kanban}
\caption{Gitea Kanban board}
\label{fig:kanban}
\end{figure}
\begin{figure}[H]
\centering
\includegraphics[width=\textwidth]{milestones}
\caption{Project milestones}
\label{fig:milestones}
\end{figure}
\subsection{WoodpeckerCI}\label{woodpecker}
WoodpeckerCI, an open-source community fork of DroneCI, is a CI/CD (Continuous Integration/Continuous Deployment) platform used to create automated pipelines for testing and building. A personal instance was hosted at \href{https://woodpecker.oscar.blue}{woodpecker.oscar.blue} and linked to the project's git repository for CI/CD.
\subsubsection{Woodpecker pipelines}
\begin{figure}[H]
\centering
\includegraphics[width=0.75\textwidth]{woodpecker-pipelines}
\caption{Woodpecker Pipelines}
\label{fig:pipelines}
\end{figure}
\begin{figure}[H]
\centering
\includegraphics[width=0.75\textwidth]{woodpecker-pipeline}
\caption{A Woodpecker Pipeline}
\label{fig:pipeline}
\end{figure}
Two pipelines were created, one to establish automated testing and the other for source code linting.
\subsection{Terraform}
When training the CNN, Terraform was used to create GPU-enabled cloud compute resources on the fly so that they could easily be built up and torn down during training. This code was barely used in the end, however, due to the move away from cloud solutions (see section \ref{Training model} for details).
\section{Design}
%You should concentrate on the more important aspects of the design. It is essential that an overview is presented before going into detail. As well as describing the design adopted it must also explain what other designs were considered and why they were rejected.
%
%The design should describe what you expected to do, and might also explain areas that you had to revise after some investigation.
%
%Typically, for an object-oriented design, the discussion will focus on the choice of objects and classes and the allocation of methods to classes. The use made of reusable components should be described and their source referenced. Particularly important decisions concerning data structures usually affect the architecture of a system and so should be described here.
%
%How much material you include on detailed design and implementation will depend very much on the nature of the project. It should not be padded out. Think about the significant aspects of your system. For example, describe the design of the user interface if it is a critical aspect of your system, or provide detail about methods and data structures that are not trivial. Do not spend time on long lists of trivial items and repetitive descriptions. If in doubt about what is appropriate, speak to your supervisor.
%
%You should also identify any support tools that you used. You should discuss your choice of implementation tools - programming language, compilers, database management system, program development environment, etc.
%
%Some example sub-sections may be as follows, but the specific sections are for you to define.
\subsection{Overall Architecture}
\begin{figure}[H]
\centering
\includegraphics[width=\textwidth]{HL-Architecture-Diagram}
\caption{High Level Architecture}
\label{fig:high level architecture}
\end{figure}
\subsection{Detailed design}
\subsubsection{CNN model architecture}
\begin{figure}[H]
\centering
\includegraphics[height=0.5\textwidth]{HL-Diagram-CNN-Simplified}
\caption{Simplified CNN Architecture}
\label{fig:simplified CNN architecture}
\end{figure}
\begin{figure}[H]
\centering
\includegraphics[height=0.9\textheight]{HL-Diagram-CNN}
\caption{CNN Architecture}
\label{fig:detailed CNN architecture}
\end{figure}
\subsubsection{Functions}
\begin{figure}[H]
\centering
\includegraphics[width=\textwidth]{diagrams/filter_paths}
\caption{filter\_paths() function}
\label{fig:filter_paths}
\end{figure}
\begin{figure}[H]
\centering
\includegraphics[width=\textwidth]{diagrams/load_config}
\caption{load\_config() function}
\label{fig:load_config}
\end{figure}
\begin{figure}[H]
\centering
\includegraphics[width=0.75\textwidth]{diagrams/filter_brightness}
\caption{filter\_brightness() function}
\label{fig:filterbrightness}
\end{figure}
\subsection{Datasets}
\paragraph{AVA}
The most common dataset used for analysing image aesthetics is AVA (Aesthetic Visual Analysis)\cite{ava_paper}. This dataset comprises over 250,000 photographs taken from DPChallenge\cite{dpchallenge}, along with a varied selection of metadata including a large number of aesthetic ratings from users for each photograph, over 10 photography style labels (macro, rule of thirds, shallow depth of field etc.), and over 60 semantic labels (landscape, animal, wedding etc.).
The paper released alongside the dataset\cite{ava_paper} states that the dataset is available at \href{http://www.lucamarchesotti.com/ava}{www.lucamarchesotti.com/ava}. However, this link has been down for the past 5 years. To work around this, there are many scripts online which will scrape DPChallenge to build the dataset. Others have uploaded the entire dataset online for quicker access and to prevent users from being rate-limited by DPChallenge.com while scraping. For this project the dataset was downloaded from MEGA (\href{https://mega.nz/folder/9b520Lzb\#2gIa1fgAzr677dcHKxjmtQ}{mega.nz/folder/9b520Lzb\#2gIa1fgAzr677dcHKxjmtQ}), a link provided by a GitHub repository\cite{ava_dataset} which also included AVA download-helper scripts.
\paragraph{AADB}
Another common dataset is AADB (Aesthetics and Attributes Database)\cite{aadb_paper}. Unlike AVA, AADB uses photos from Flickr\cite{flickr}, an image sharing site which is less targeted towards professional photographers and therefore hosts a more varied range of aesthetic quality compared to DPChallenge. However, AADB contains only 10,000 images compared to AVA's 250,000. In AADB, rater identity is anonymously recorded and tracked across different photos, which the authors use in their sampling strategy to contextualize the subjectivity of aesthetic tastes.
As the project aims to help the general public select frames of high aesthetic quality from videos, it may make more sense to use the AADB dataset as it contains more amateur photography than AVA. AVA, however, has the advantage of being a larger dataset. Both datasets could be helpful to the project.
\section{Implementation}
%This section should discuss issues you encountered as you tried to implement your experiments. What were the results of running the experiments? What conclusions can you draw from these results?
%
%During the work, you might have found that elements of your experiments were unnecessary or overly complex; perhaps third-party libraries were available that simplified some of the functions that you intended to implement. If things were easier in some areas, then how did you adapt your project to take account of your findings?
%
%It is more likely that things were more complex than you first thought. In particular, were there any problems or difficulties that you found during implementation that you had to address? Did such problems simply delay you or were they more significant?
%
%If you had multiple experiments to run, it may be sensible to discuss each experiment in separate sections.
\subsection{Configuration}
A YAML configuration file\footnote{See example configuration file in Appendix B section \ref{config file}.} was created to enable quick changes to the filters and pipeline options. YAML was selected as it is human-readable and, unlike JSON, allows comments. The main options were:
\begin{itemize}
\item \textbf{filters} - declare which filters you want to use, and in which order (default)
\item \textbf{CNNrank} - declare whether you want to rank the remaining images using a CNN (default: False)
\item \textbf{ignore\_video} - declare whether you want to ignore video files when loading in paths (default: False)
\item \textbf{brightness\_options} - declare options for brightness filter
\item \textbf{filesize\_options} - declare options for filesize filter
\item \textbf{contrast\_options} - declare options for contrast filter
\item \textbf{focus\_options} - declare options for focus filter
\end{itemize}
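As a rough illustration of how such options could be combined with defaults once the YAML file has been parsed into a dictionary, the sketch below overlays user-supplied options on a default set. The option names are taken from the list above; every value, every nested key, the default filter order and the helper name merge\_config are assumptions made purely for illustration and are not taken from the project's code.

```python
import copy

# Hypothetical defaults: the top-level option names come from the report,
# but all values and nested keys are assumptions made for illustration.
DEFAULT_CONFIG = {
    "filters": ["filesize", "brightness", "contrast", "focus"],
    "CNNrank": False,
    "ignore_video": False,
    "brightness_options": {"lower": 0.2, "upper": 0.8},
    "filesize_options": {"min_bytes": 50_000},
    "contrast_options": {"lower": 0.05, "upper": 0.95},
    "focus_options": {"min_variance": 100.0},
}


def merge_config(user_config):
    """Overlay user-supplied options on the defaults, one nesting level deep."""
    merged = copy.deepcopy(DEFAULT_CONFIG)
    for key, value in user_config.items():
        if key not in merged:
            raise KeyError(f"unknown option: {key}")
        if isinstance(merged[key], dict) and isinstance(value, dict):
            merged[key].update(value)  # keep unspecified nested defaults
        else:
            merged[key] = value
    return merged
```

Rejecting unknown keys is a design choice in this sketch: it makes a misspelt option name fail loudly instead of being silently ignored.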
\subsection{Pipeline}
The overall pipeline takes in a set of paths and applies image processing in layers to filter out the images of lower aesthetic quality, thus reducing the set of paths (see figure \ref{fig:high level architecture}).
% video is continuous(wrong word) so you get context
\begin{itemize}
\item Loads in the video or image
\item Runs through each filter and outputs a subset of the list of paths
\end{itemize}
\subsection{Filters}
In general, each filter takes in a list of paths and some options (for thresholds etc.). The filter attempts to reduce the list of paths based on its filtering technique and threshold. The reasoning behind each filter taking a list of paths as an input was to enable future functionality which would process each frame in the context of the other frames. Depending on the filter type, some filters only have a lower bound (e.g. discarding files below a certain size) while others have both a lower and an upper bound (e.g. discarding images that are too dark or too bright in the brightness filter).
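The layered behaviour described above can be sketched as a loop over filter functions that all share the same shape: a list of paths and an options dictionary in, a reduced list of paths out. This is a minimal sketch under that assumption; the names run\_pipeline, keep\_extension and drop\_prefix are hypothetical, and the two toy filters only inspect the path string so that the example stays self-contained.

```python
def run_pipeline(paths, filters):
    """Apply each (filter_function, options) pair in order, shrinking the list."""
    remaining = list(paths)  # copy so the caller's list is left untouched
    for filter_function, options in filters:
        remaining = filter_function(remaining, options)
        if not remaining:
            break  # nothing left for later filters to process
    return remaining


# Toy filters for demonstration only: they look at the path, not the image.
def keep_extension(paths, options):
    return [p for p in paths if p.lower().endswith(options["extension"])]


def drop_prefix(paths, options):
    return [p for p in paths if not p.startswith(options["prefix"])]
```

Because every filter has the same signature, a new filter can be added by appending another pair to the list passed to run\_pipeline, which is one possible route towards plug-in filters.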
\subsubsection{Filesize}
When comparing file sizes, it is reasonable to assume that larger files contain more information. More information may indicate a more interesting image, which is therefore more likely to be of higher aesthetic quality. When filtering by filesize, only the lower bound is considered. In other words, only images below a certain filesize are filtered out. To do this, a threshold is set and the filesize of each file is computed. If the filesize is lower than the threshold then the image is discarded.
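A minimal sketch of this lower-bound check is shown here. The function name filter\_filesize and the parameter min\_bytes are illustrative assumptions rather than the project's actual identifiers; the only library call used is os.path.getsize from the Python standard library.

```python
import os


def filter_filesize(paths, min_bytes):
    """Keep only the paths whose file size is at least min_bytes.

    Files smaller than the threshold are discarded; there is no upper bound.
    """
    return [path for path in paths if os.path.getsize(path) >= min_bytes]
```

The threshold is given in bytes, so a file exactly at the threshold is kept.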
% only has lower bounds
\subsubsection{Brightness}
To quantify an image's brightness the L*a*b* colour space is used. This colour space is meant to reflect human perception of light and colour. Where only brightness is of concern, the L* (lightness) channel can be used as a metric comparable to human perception of brightness. To use this effectively, the image is first converted to L*a*b*, then the pixel values of the L* channel are normalised by dividing each pixel's value by the maximum value, and finally the average of the normalised values is calculated. This gives the average lightness of the image. Brightness should have both a lower and an upper bound, as images can be over-exposed or under-exposed and both extremes result in poor image quality. Therefore, a percentile band can be set, removing the images that land in the extreme upper and lower ranges.
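To make the calculation concrete, the following dependency-free sketch computes the CIELAB L* value of 8-bit sRGB pixels directly from the standard conversion formulas (D65 white point), normalises it to the range 0 to 1 and averages it. It is not the project's implementation, which would more typically rely on an image library for the colour space conversion; the function names, and the use of fixed lower and upper limits instead of percentiles, are assumptions made for illustration.

```python
def srgb_to_lightness(red, green, blue):
    """Convert one 8-bit sRGB pixel to CIELAB L* (0 to 100)."""
    def linearise(channel):
        c = channel / 255.0
        return c / 12.92 if c <= 0.04045 else ((c + 0.055) / 1.055) ** 2.4

    # Relative luminance Y computed from the linearised sRGB channels.
    y = (0.2126 * linearise(red)
         + 0.7152 * linearise(green)
         + 0.0722 * linearise(blue))
    delta = 6 / 29
    f_y = y ** (1 / 3) if y > delta ** 3 else y / (3 * delta ** 2) + 4 / 29
    return 116 * f_y - 16


def mean_brightness(pixels):
    """Average L* over (r, g, b) tuples, normalised to the range 0 to 1."""
    values = [srgb_to_lightness(r, g, b) / 100.0 for r, g, b in pixels]
    return sum(values) / len(values)


def keep_within_band(scores, lower, upper):
    """Given {path: brightness}, keep paths with lower <= brightness <= upper."""
    return [path for path, value in scores.items() if lower <= value <= upper]
```

With this sketch a pure black image scores approximately 0 and a pure white image approximately 1, so limits such as 0.2 and 0.8 (hypothetical values) would discard both extremes.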
\subsubsection{Contrast}
To detect contrast, the existing ``is\_low\_contrast'' function from the skimage\cite{skimage} library is used. This function takes an image and a threshold and returns true if the contrast is below the threshold value and false otherwise. Like brightness, contrast has lower and upper bounds, as an image can have too little contrast or too much, and images outside these bounds need to be removed. When researching how to calculate an image's contrast levels, the following post was found and followed as a guide\cite{pyis_contrast}.
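The idea behind a percentile-based contrast check can be sketched without any image library: measure how much of the available intensity range is covered between a low and a high percentile, and compare that fraction against thresholds. The sketch below is an illustrative stand-in and not the skimage implementation; the function names, the choice of the 1st and 99th percentiles, and the default threshold of 0.05 are assumptions.

```python
import statistics


def contrast_ratio(pixels, max_value=255):
    """Spread between the 1st and 99th percentile as a fraction of the range.

    `pixels` is a flat sequence of greyscale intensities (at least two values).
    """
    cuts = statistics.quantiles(pixels, n=100, method="inclusive")
    low, high = cuts[0], cuts[-1]  # 1st and 99th percentile
    return (high - low) / max_value


def looks_low_contrast(pixels, fraction_threshold=0.05):
    """True when the image uses less than the given fraction of the range."""
    return contrast_ratio(pixels) < fraction_threshold


def within_contrast_band(pixels, lower, upper):
    """True when the contrast ratio lies between the lower and upper bounds."""
    return lower <= contrast_ratio(pixels) <= upper
```

Using percentiles rather than the absolute minimum and maximum means that a handful of stray very dark or very bright pixels does not hide an otherwise flat image.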
% has upper and lower bounds
\subsubsection{Focus}
There are two common methods for detecting whether an image is blurry: the variance of the Laplacian and the fast Fourier transform. The Laplacian variance method can be implemented fairly easily using cv2: the Laplacian of the image is computed and the variance (the standard deviation squared) of the result is taken. A low variance indicates that there aren't many edges in the image, something that can be seen in blurry pictures, while a high variance means there are a lot of edges and therefore the image is less likely to be blurry. This method alone cannot tell whether the subject of an image with a shallow depth of field is in focus, as it will only report that ``this image is somewhat blurred'', which doesn't tell us much. The method works well when a good level of focus can be computed beforehand and the outliers can then be removed (as for a consistent video feed). The following guides were used as a resource for these methods\cite{pyis_laplace}\cite{pyis_fft}.
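The Laplacian variance measure can be illustrated with a small pure-Python sketch that applies the 4-neighbour Laplacian kernel to the interior pixels of a greyscale image stored as a list of rows. This is a simplified stand-in for the OpenCV route (commonly written as cv2.Laplacian(image, cv2.CV\_64F).var()), whose border handling differs; the function names and the threshold value are assumptions for illustration.

```python
import statistics


def laplacian_variance(gray):
    """Variance of the 4-neighbour Laplacian over the interior of a 2D grid.

    `gray` is a list of equally long rows of intensities, at least 3x3.
    """
    responses = []
    for row in range(1, len(gray) - 1):
        for col in range(1, len(gray[0]) - 1):
            responses.append(
                gray[row - 1][col] + gray[row + 1][col]
                + gray[row][col - 1] + gray[row][col + 1]
                - 4 * gray[row][col]
            )
    return statistics.pvariance(responses)


def is_blurry(gray, threshold):
    """Flag the image as blurry when its Laplacian variance is below threshold."""
    return laplacian_variance(gray) < threshold
```

A perfectly flat image has no edges and gives a variance of zero, while an image with many sharp transitions gives a large variance, which matches the interpretation described above.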
% only how lower bounds but a bit more complex
\subsection{CNN}
\subsubsection{CNN architecture}
Originally, the plan was to build a CNN from scratch using PyTorch. This required a lot of research as the CS36220 Machine Learning\cite{ml-module} module didn't cover the details of CNNs nor their practical implementation. To gain a better theoretical understanding, Stanford University's online lecture material covering deep learning software\cite{stanford_dl} and CNN architecture\cite{stanford_cnn} was used. For practical knowledge, PyTorch's official tutorial for training\cite{pytorch-training} and pyimagesearch's guide on training a CNN\cite{pyis-cnn} were used as a foundation for understanding how to use PyTorch. This is also where the idea of using a separate configuration file for storing variables such as batch sizes and validation splits came from. While attempting to build a CNN, existing CNN approaches\cite{dcnn}\cite{aadb_paper} were researched in detail to build foundational knowledge of how to tackle the shared problem of judging image aesthetic quality. It soon became clear that most successful models were using transfer learning instead of building a CNN from scratch. Transfer learning is the process of using the knowledge learned by an existing model and transferring it to a new model which is attempting to solve a similar problem.
After understanding its benefits, a transfer learning approach to creating a CNN was adopted. This approach was better suited to the project due to the time constraints and its higher probability of producing a good working model.
ResNet50\cite{pytorch-resnet} was chosen as the base model as ResNet-based networks have historically performed very well in transfer learning due to their large number of layers. This depth contributed to ResNet taking first place in the classification, detection and localisation tasks of ILSVRC (ImageNet Large Scale Visual Recognition Challenge) and in the detection and segmentation tasks of Microsoft's COCO (Common Objects in Context) in 2015. ResNet50 is a ResNet model that is 50 layers deep (see figure \ref{fig:detailed CNN architecture}) and predicts 1000 different classes. It is trained on the ImageNet\cite{imagenet} dataset, a large dataset of over 14 million annotated images. This dataset is used in ILSVRC as a benchmark for models in classification and object detection.
Although this project's problem could be framed as a classification problem, asking ``Is this image of high aesthetic quality?'' with the two classes ``Yes'' and ``No'', better insight would be gained from a regression-based model asking ``How high is the aesthetic quality of this image?'', which would return a rating between 0 and 10. To learn specifically how to implement a regression-based model, a couple of guides\cite{pyis_keras_cnn}\cite{pyis_keras} on building regression-based models using the real estate dataset\cite{house_dataset} were followed. pyimagesearch's guide to transfer learning in PyTorch for classification problems\cite{pyis_pytorch_transferlearning} was also used, and much of the structure of the model.py code is inspired by that post. Transferring the knowledge from the ResNet50 model and altering the model to fit this problem requires a few steps:
\begin{enumerate}
\item Load ResNet50 model
\item Freeze all layers except the last
\item Remove final layer
\item Add a fully connected layer with one output and a ReLU activation function
\item Train using new problem data (AVA dataset)
\end{enumerate}
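The steps above can be sketched in PyTorch as follows. This is a minimal illustration and not the project's model.py: the function name \texttt{build\_rating\_regressor} is hypothetical, and it assumes the backbone exposes its final layer as an attribute named \texttt{fc}, as torchvision's ResNet models do. Freezing every existing parameter and then replacing the final layer has the same effect as freezing all layers except the last, because the newly created layer is trainable by default.

```python
import torch
from torch import nn


def build_rating_regressor(backbone: nn.Module) -> nn.Module:
    """Turn a classifier whose last layer is `fc` into a one-output regressor."""
    # Freeze every pretrained parameter so that only the new head is trained.
    for param in backbone.parameters():
        param.requires_grad = False

    # Replace the final classification layer with one output followed by ReLU.
    # Newly created layers have requires_grad=True by default.
    in_features = backbone.fc.in_features
    backbone.fc = nn.Sequential(nn.Linear(in_features, 1), nn.ReLU())
    return backbone


# Assumed usage with torchvision >= 0.13 (downloads the ImageNet weights):
#   from torchvision import models
#   model = build_rating_regressor(models.resnet50(weights="IMAGENET1K_V1"))
```

Only the parameters of the new head would then be passed to the optimiser, for example by filtering on \texttt{requires\_grad}.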
The original plan was to train two models, one with the AVA dataset and one with the AADB dataset, and compare their accuracy. Due to time constraints and other aspects of the project being neglected, this had to be dropped and only the AVA dataset was used in the end.
\subsubsection{Working with the dataset}
The first step in working with the AVA dataset is to load the data in the required format. As the images in AVA were rated on a scale of 1 to 10, the model should also output a rating between 1 and 10. AVA exposes these ratings as a count for each rating (see figure \ref{fig: ava-entry}). In other words, for each image it shows how many users gave the image each rating (1/10, 2/10 etc.). For this project's use case, these counts aren't very useful on their own; they need to be processed into a single value. To do this, each count is multiplied by the rating it represents (the count of 1/10 ratings is multiplied by 1, the count of 2/10 ratings by 2 etc.). The sum of these products is then divided by the total number of votes cast. This gives a single-value rating between 1 and 10 (see figure \ref{fig: calculate_image_rating}).
\begin{figure}[H]
\includegraphics[width=\linewidth]{AVA-entry}
\caption{AVA Dataset entry}
\label{fig: ava-entry}
\end{figure}
\begin{figure}[H]
\includegraphics[width=0.75\linewidth]{calculate_image_rating}
\caption{calculate\_image\_rating() method}
\label{fig: calculate_image_rating}
\end{figure}
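The weighted-mean calculation described in this section can be sketched in plain Python. This is an illustration only and not the project's calculate\_image\_rating() method: the function name \texttt{mean\_rating} and the list-of-counts input layout are assumptions, and returning 0 when no votes were cast is a convention chosen for the sketch.

```python
from typing import Sequence


def mean_rating(counts: Sequence[int]) -> float:
    """Collapse per-score vote counts into one weighted-mean rating.

    counts[0] is the number of 1/10 votes, counts[1] the number of 2/10
    votes, and so on (hypothetical input layout for this sketch).
    """
    total_votes = sum(counts)
    if total_votes == 0:
        # No votes were cast, so return 0 instead of dividing by zero.
        return 0.0
    weighted_sum = sum(score * count for score, count in enumerate(counts, start=1))
    return weighted_sum / total_votes


# Example: two votes of 5/10 and two votes of 7/10 give (10 + 14) / 4 = 6.0.
example_counts = [0, 0, 0, 0, 2, 0, 2, 0, 0, 0]
print(mean_rating(example_counts))
```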
\subsubsection{Training model}\label{Training model}
In order to train the model, PyTorch's in-built training functionality was used. Initially, the model was trained for 20 epochs on the CPU to validate that there weren't any runtime errors. Training for 20 epochs on a CPU took just over 4 hours, meaning 2000 epochs would take over 400 hours, or roughly 16 and a half days. To speed up processing, GPU compute power was required. PyTorch has long supported using CUDA on Nvidia GPUs for accelerated workloads but has only recently implemented support for AMD's equivalent, ROCm\footnote{As of writing ROCm support for PyTorch is still in beta.}\cite{pytorch-rocm}\cite{rocm-ml}. ROCm also has few installation targets, mostly packaging the stack for enterprise Linux distributions like RHEL (Red Hat Enterprise Linux), SLES (SUSE Linux Enterprise Server), and Ubuntu\footnote{Ubuntu requires the HWE (Hardware Enablement) kernel.}. Fortunately, ROCm is open source, unlike Nvidia's counterpart, CUDA. This meant that, in theory, ROCm could be compiled from source and installed manually. When this was attempted, the machine compiling the package ran out of memory and halted. ROCm was abandoned and the choice was made to move forward with CUDA. As the computer used for the project didn't have an Nvidia GPU, the option of using cloud computing was explored.
The next step involved researching a number of cloud providers which offered GPU compute and comparing prices. AWS (Amazon Web Services), the biggest global cloud provider to date, was unrealistically expensive, so GCP (Google Cloud Platform), Azure, and Linode were considered instead. Although each of these platforms advertises GPU compute services, each has a GPU quota which is set to zero by default. Lifting this restriction requires contacting support or, in the case of Linode, there is an additional requirement to have \$100 or more in transactions before asking for the lock to be removed. After contacting GCP support, the quota for GPUs was increased to one, which enabled the CNN model to be trained using Google's cloud GPU compute.
As part of the project and the drive to create an optimal development environment, IaC (Infrastructure as Code) was used to automate deployments of cloud resources rapidly, minimising cost and saving time when training. Terraform, a cloud-agnostic IaC tool, was used to define the architecture for the cloud resources required to run a training workload. Working with GCP was harder than expected due to its unconventional methods. By default, GCP creates a user account for a newly created server and disables password authentication. Instead, GCP creates an SSH key pair and stores the private key within the GCP platform, where it isn't viewable by the user. To log in, users have to use the gcloud CLI application, which connects to their GCP account, fetches the private SSH key and automatically logs the user in. Due to this convoluted process, it was difficult to automate the post-installation commands required to set up the training environment. Further issues arose when the decision was made to use Google's official machine learning images, which were advertised as ``the easiest way to get started because these images already have the NVIDIA drivers and CUDA libraries pre-installed''\cite{gcp-gpu}. In reality, CUDA was not pre-installed. Instead, an installation script ran on boot which would install the aforementioned drivers and libraries. Unfortunately, Google's script clashed with the project's post-installation script as both of them used apt. The project's script would lock apt while installing its dependencies, preventing Google's script from using apt to install the drivers. Google's script would then terminate but wrongly report that the drivers had been installed.
Around the same time, the university offered access to its GPU compute cluster on the condition that the project only used one GPU from the old server. The university's GPU compute cluster manages jobs using slurm\cite{slurm}, which enqueues each new job to the cluster. The resources on the old server limited it to very few concurrent jobs, creating a scenario where existing jobs would have to finish before new ones could start. This was ultimately the best option, considering it was being provided at no extra cost and the setup was much more straightforward compared to GCP.
The university's GPU cluster uses Anaconda\cite{anaconda} virtual environments to run experiments. Once a new environment was created and the dependencies for the model training code were installed, a model could be trained. This code trained the model for 2000 epochs and saved a plot of the training and validation loss.
To speed up processing time, a batch of the dataset was pre-processed with translations and saved to tensors. This was originally done to speed up runtime troubleshooting in the training code by cutting out the time it took to process the images before each run.
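Caching a pre-processed batch in this way could look like the sketch below. It is an assumption about one possible approach, not the project's code: the helper names \texttt{save\_batch} and \texttt{load\_batch}, the dictionary keys and the file name are all hypothetical, and random tensors stand in for real images and ratings.

```python
import tempfile
from pathlib import Path

import torch


def save_batch(images: torch.Tensor, ratings: torch.Tensor, path: Path) -> None:
    """Store an already pre-processed batch so later runs can skip that work."""
    torch.save({"images": images, "ratings": ratings}, path)


def load_batch(path: Path):
    """Reload a cached batch as an (images, ratings) pair."""
    batch = torch.load(path)
    return batch["images"], batch["ratings"]


# Usage with random stand-in data: 4 RGB images of 224x224 and their ratings.
with tempfile.TemporaryDirectory() as tmp_dir:
    cache_path = Path(tmp_dir) / "batch.pt"
    save_batch(torch.rand(4, 3, 224, 224), torch.rand(4) * 10, cache_path)
    images, ratings = load_batch(cache_path)
    print(images.shape, ratings.shape)
```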
\section{Testing}
%Detailed descriptions of every test case are definitely not what is required in this section; the place for detailed lists of tests cases is in an appendix. In this section, it is more important to show that you adopted a sensible strategy that was, in principle, capable of testing the system adequately even if you did not have the time to test the system fully.
%
%Provide information in the body of your report and the appendix to explain the testing that has been performed. How does this testing address the requirements and design for the project?
%
%How comprehensive is the testing within the constraints of the project? Are you testing the normal working behaviour? Are you testing the exceptional behaviour, e.g. error conditions? Are you testing security issues if they are relevant for your project?
%
%Have you tested your system on ``real users''? For example, if your system is supposed to solve a problem for a business, then it would be appropriate to present your approach to involve the users in the testing process and to record the results that you obtained. Depending on the level of detail, it is likely that you would put any detailed results in an appendix.
%
%Whilst testing with ``real users'' can be useful, don't see it as a way to shortcut detailed testing of your own. Think about issues discussed in the lectures about until testing, integration testing, etc. User testing without sensible testing of your own is not a useful activity.
%
%The following sections indicate some areas you might include. Other sections may be more appropriate to your project.
\subsection{Overall Approach to Testing}
Due to the exploratory and experimental nature of this project, a lot of code was used temporarily or was CNN-based, which is difficult to write tests for. Therefore, testing was left until the end of the project, when there was more structure and the pieces were finalised.
\subsection{Automated Testing}
CI/CD pipelines were set up to facilitate automated testing using pytest and the pytest-cov plugin, which shows a breakdown of test coverage across the codebase.
\subsubsection{Unit Tests}
Only 4 unit tests were implemented. Three of the tests check the functionality that calculates a single value from the AVA ratings. As a lot of the functions in this project rely on complex data structures like pandas DataFrames and images, testing was quite difficult and the use of mocks was necessary.
As the ``calculate\_image\_rating()'' function reads a CSV file and stores it as a DataFrame, a mock DataFrame had to be created for each test.
\paragraph{test\_calculate\_image\_rating\_with\_no\_ratings}
This test checks that if all ratings for an image are 0 then the resulting single score is also zero. This test initially failed and revealed that, following the logic of the algorithm, it would try to divide 0 by 0. This caused a runtime error, so a specific check had to be added to the function to test whether the summed score equals 0. If it does, the function avoids dividing by the total number of votes and instead just returns 0.
\paragraph{test\_calculate\_image\_rating\_multiple\_ratings}
This test checks that the algorithm accurately calculates a single rating value.
\paragraph{test\_calculate\_image\_rating\_incorrect\_number\_of\_columns}
This test checks that the right exception is raised when it attempts to calculate the image rating for an entry which has more than 15 columns, the number of columns an entry has in the AVA dataset.
\paragraph{test\_load\_config\_nonexistent\_file}
This test checks that the right exception is thrown when the ``load\_config()'' function in the autophotographer\_main.py script attempts to load a non-existent file. This test is not implemented very well, as it relies on the filesystem of the user. If the user has a file named ``blahblahblah'' in the right location on their system, the test will fail incorrectly.
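One way to remove this dependence on the user's filesystem is to build the missing path inside a freshly created temporary directory, which is guaranteed to be empty. The sketch below illustrates the idea under stated assumptions: the \texttt{load\_config()} shown here is a hypothetical stand-in that simply opens the file, \texttt{FileNotFoundError} is assumed to be the expected exception, and the file name is arbitrary.

```python
import tempfile
from pathlib import Path


def load_config(path):
    """Hypothetical stand-in for the project's load_config(); returns file text."""
    with open(path, "r", encoding="utf-8") as handle:
        return handle.read()


def test_load_config_nonexistent_file():
    # A fresh temporary directory is empty, so this path cannot already exist.
    with tempfile.TemporaryDirectory() as tmp_dir:
        missing = Path(tmp_dir) / "missing_config.yml"
        try:
            load_config(missing)
        except FileNotFoundError:
            return  # expected: the file does not exist
        raise AssertionError("load_config() did not raise FileNotFoundError")


test_load_config_nonexistent_file()
```

With pytest, the same effect could be achieved with the built-in \texttt{tmp\_path} fixture and \texttt{pytest.raises}.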
\subsection{CNN testing}
When testing the CNN's performance, a small random batch of 4 images is selected and their ratings are predicted and plotted. However, as the images are fetched from the validation loader, they have already been pre-processed, so they are shown at the reduced size they are scaled down to for training.
\begin{figure}[H]
\centering
\includegraphics[width=0.5\textwidth]{cnn-testing}
\caption{CNN Testing}
\label{fig: cnn testing}
\end{figure}
As observed in figure \ref{fig: cnn testing}, the CNN isn't very accurate at predicting the aesthetic quality of these images, and the gap between prediction and ground truth varies widely between images.

@ -0,0 +1,15 @@
\thispagestyle{empty}
%TC:ignore
\section*{\centering Abstract}
%Include an abstract for your project. This should be approximately 300 words.
%
%The abstract is an overview of the work you have done. Highlight the purpose of the work and the key outcomes of the work.
% Why aesthetics is an active research area
% conferences about it etc...
The goal of the Autophotographer project is to explore the idea of combining machine learning and conventional image processing techniques to select images with a high level of aesthetic quality.
%TC:endignore

@ -0,0 +1,14 @@
\thispagestyle{empty}
%TC:ignore
\section*{\centering Acknowledgements}
I would like to thank the following people for their support and contributions towards the project:
\begin{itemize}
\item my major project supervisor, Dr Hannah Dee, for helping me keep calm and collected throughout the project and for keeping me on track.
\item Aberystwyth University and Dave Price for providing me access to the university's GPU compute cluster and providing assistance with using it.
\end{itemize}
%TC:endignore

@ -0,0 +1,32 @@
%TC:ignore
\title{Autophotographer}
% Your name
\author{Oscar Pocock}
% Your email
\authoremail{osp1@aber.ac.uk}
\degreeschemecode{G401} %e.g. G400
\degreeschemetitle{Computer Science (with integrated year in industry)} % e.g. Computer Science
\degreetype{BSc}
\modulecode{CS39440} % i.e. CS39440, CC39440, CS39620
\moduletitle{Major Project} % i.e. Major Project or Minor Project
%\date{19th February 2019} % i.e. the date of the current version of your report
\date{\today} % i.e. the date of the current version of your report
\status{Release} % Use draft until you create the release version. Then, change this to Release.
\version{1.0}
% The title and name of your supervisor.
\supervisor{Dr. Hannah Dee}
%The email for your supervisor.
\supervisoremail{osp1@aber.ac.uk}
\maketitle
%TC:endignore

@ -0,0 +1,54 @@
\thispagestyle{empty}
%TC:ignore
%%%
%%% You must sign the declaration of originality.
%%%
%%% You are submitting this electronically. Therefore, to sign, you
%%% type your name and date to replace the .... characters.
%%%
\section*{\centering Declaration of originality}
I confirm that:
\begin{itemize}
\item{This submission is my own work, except where
clearly indicated.}
\item{I understand that there are severe penalties for Unacceptable Academic Practice, which can lead to loss of marks or even the withholding of a degree.}
\item{I have read the regulations on Unacceptable Academic Practice from the University's Academic Registry (AR) and the relevant sections of the current Student Handbook of the Department of Computer Science.}
\item{In submitting this work I understand and agree to abide by the University's regulations governing these issues.}
\end{itemize}
\vspace{2em}
Name: Oscar Pocock\\
\vspace{1em}
Date: \today \\
%%%
%%% We would like to make a selection of final reports available to students that take
%%% this module in future years. To enable us to do this, we require your consent. You
%%% are not required that you do this, but if you do give your consent, then we will have
%%% the option to select yours as one of a number of reports as examples for other
%%% students. If you would like to give your consent, then please include the following
%%% text and type your name and date to replace the .... characters.
%%%
%%% If you do not wish to give your consent, please remove this from your report.
%%%
\vspace{1em}
\section*{\centering Consent to share this work}
By including my name below, I hereby agree to this project's report and technical work being made available to other students and academic staff of the Aberystwyth Computer Science Department.
\vspace{2em}
Name: Oscar Pocock\\
\vspace{1em}
Date: \today \\
%TC:endignore

40
docs/report/README.md Normal file
@ -0,0 +1,40 @@
# LaTeX Template for MMP Report
This folder contains a LaTeX template for the project report.
In a change to previous years, there is one combined template for the Engineering-oriented and Research-oriented projects. The organisation of the files has been simplified to make it easier to maintain this resource.
## Engineering or Research
Within `mmp-report.tex`, there is a definition for `\reporttype`. This can have one of two values: `Engineering` or `Research`. This value is used to determine which of the two `Chapters_x` folders are read, where x is one of `Engineering` or `Research`.
The intention is that you only use chapters from one of these two directories. They are handled in this way to make it easier to manage the production of the template. If you are doing an engineering-oriented project, then you do not need to keep the research-oriented files.
## Colours
The title, chapter and section headings are assigned a blue colour within the template. To control that colour and, if you prefer, remove the colour to simply use the default font colour, you can adjust the setting in `mmp-report.tex`. Look for the comments on `Title and Section Colours`.
## Style File
The LaTeX style file is defined in `StylesAndReferences/mmp-report.sty`. Editing this file is for those who are advanced users of LaTeX or the curious who are willing to learn something new. One area you may want to modify is the default choice of font. The document is set up to use `tgheros`, a sans serif font. You may prefer a different font or even a serif font. Look for the `% Fonts` comment in the style file to find the elements to change.
## Text alignment
The default text alignment in a LaTeX document is to justify the text. That can cause a problem for some people, such as those with dyslexia. The LaTeX command `\raggedright` has been added to the main report document.
## References
The set of example references is contained in the file `StylesAndReferences/references.bib`. This has been manually created but it could easily be something that is generated from a reference management tool such as Mendeley.
Two example reference styles are provided for in the `mmp-report.tex` file. One is an author-date style (`StylesAndReferences/authordate2annot`) and one is an IEEE style (`StylesAndReferences/IEEEannotU`).
The two styles are capable of producing annotated references. For projects submitted in 2022, it has been decided that we will not require annotated references. The two style files have been updated to stop the generation of the annotations.
## Source of help for LaTeX
The internet is not short of good resources to help you get the most out of LaTeX. One useful resource is hosted by [Overleaf](https://www.overleaf.com/learn/latex/Main_Page), which is also an online LaTeX editor.
A LaTeX channel is also available on the Discord community server that has been set up for the module. If you have questions about this template, then please ask there and possibly share your experience. To find out how to access the Discord server, see the home page for the module in Bb.
Neil Taylor
28th February 2022

File diff suppressed because it is too large

File diff suppressed because it is too large

File diff suppressed because it is too large

@ -0,0 +1,358 @@
%% LaTeX class to write Major/Minor projects for the Computer Science Department,
%% Aberystwyth University.
%%
%% Written by Neil Taylor
%% Comments and bugs to nst@aber.ac.uk
%%
%% See the accompanying file MMP_ProgressReport_example.tex for an example on how to
%% use it.
%%
\ProvidesPackage{mmp-report}
%\usepackage[a4paper,margin=2.5cm,nohead, headheight=3cm, headsep=20pt]{geometry}
\usepackage{graphicx} %to allow images to be imported see example at end of template
\usepackage{palatino} %or {times} etc
\usepackage{plain} %bibliography style
\usepackage{amsmath} %math fonts - just in case
\usepackage{amsfonts} %math fonts
\usepackage{amssymb} %math fonts
% Packages used to change colours for fonts
\usepackage{xcolor}
\usepackage{sectsty}
% packages for figures
\usepackage{graphicx}
\usepackage{subcaption}
\usepackage{float}
% Fonts
% The following font will use a Sans Serif font throughout the document
% See https://www.overleaf.com/learn/latex/Font_typefaces for discussion
% of other fonts. Include both lines to use a sans serif font throughout the
% document. To use a Serif font (e.g. Times), then comment out the following
% two lines.
\usepackage{tgheros}
\renewcommand{\familydefault}{\sfdefault}
% To use a specific Serif font, comment out the following line and
% change palatino to a font of your choice.
%\usepackage{palatino}
% package for page headers
\usepackage{fancyhdr}
\usepackage{lastpage}
% Package for url formatting and also creates hyperlinks.
\usepackage[hidelinks]{hyperref}
% Various parameters
\setlength{\textheight}{229mm}
\setlength{\topmargin}{-5.4mm}
\setlength{\textwidth}{150mm}
% Adjust the following if you want to introduce some indentation for
% margins (e.g. if printing).
\setlength{\oddsidemargin}{0mm}
\setlength{\evensidemargin}{0mm}
\setlength{\headheight}{0pt}
\parskip=8pt
% index and section numbering depths
\setcounter{secnumdepth}{3}
\setcounter{tocdepth}{4}
\newcommand{\is}{\hspace*{0.2in}} %little indent space
\newcommand{\reporttype}[1]{
\newcommand{\showreporttype}{#1}
}
\newcommand{\mmpdocdate}{\today}
\newcommand{\documentdate}[1]{
\renewcommand{\mmpdocdate}{#1}
}
\newcommand{\name}[1]{
\newcommand{\showname}{#1}
}
\newcommand{\userid}[1]{
\newcommand{\showuserid}{#1}
}
\newcommand{\supervisor}[1]{
\newcommand{\showsupervisor}{#1}
}
\newcommand{\supervisorid}[1]{
\newcommand{\showsupervisorid}{#1}
}
\newcommand{\projecttitle}[1]{
\newcommand{\showprojecttitle}{#1}
}
\newcommand{\reporttitle}[1]{
\newcommand{\showreporttitle}{#1}
}
\newcommand{\projecttitlememoir}[1]{
\newcommand{\showprojecttitlememoir}{#1}
}
\newcommand{\version}[1]{
\newcommand{\showversion}{#1}
}
\newcommand{\docstatus}[1]{
\newcommand{\showdocstatus}{#1}
}
\newcommand{\degreeschemecode}[1]{
\newcommand{\showdegreeschemecode}{#1}
}
\newcommand{\degreeschemename}[1]{
\newcommand{\showdegreeschemename}{#1}
}
\newcommand{\modulecode}[1]{
\newcommand{\showmodulecode}{#1}
}
\newcommand{\wordcount}[1]{
\newcommand{\showwordcount}{#1}
}
\newcommand{\helv}{
%bold, 18point, 21point line spacing
\fontfamily{phv}\fontseries{b}\fontsize{18}{21}\selectfont
}
\def\labelenumi{\arabic{enumi}.}
\def\theenumi{\arabic{enumi}}
\def\labelenumii{(\alph{enumii})}
\def\theenumii{\alph{enumii}}
\def\p@enumii{\theenumi}
\def\labelenumiii{\roman{enumiii}.}
\def\theenumiii{\roman{enumiii}}
\def\p@enumiii{\theenumi(\theenumii)}
\def\labelenumiv{\Alph{enumiv}.}
\def\theenumiv{\Alph{enumiv}}
\def\p@enumiv{\p@enumiii\theenumiii}
\def\labelitemi{$\bullet$}
\def\labelitemii{\bf --}
\def\labelitemiii{$\ast$}
\def\labelitemiv{$\cdot$}
\def\verse{\let\\=\@centercr
\list{}{\itemsep\z@ \itemindent -1.5em\listparindent \itemindent
\rightmargin\leftmargin\advance\leftmargin 1.5em}\item[]}
\let\endverse\endlist
\def\quotation{\list{}{\listparindent 1.5em
\itemindent\listparindent
\rightmargin\leftmargin \parsep 0pt plus 1pt}\item[]}
\let\endquotation=\endlist
\def\quote{\list{}{\rightmargin\leftmargin}\item[]}
\let\endquote=\endlist
\def\descriptionlabel#1{\hspace\labelsep \bf #1}
\def\description{\list{}{\labelwidth\z@ \itemindent-\leftmargin
\let\makelabel\descriptionlabel}}
\let\enddescription\endlist
\def\theequation{\arabic{equation}}
\def\titlepage{\@restonecolfalse\if@twocolumn\@restonecoltrue\onecolumn
\else \newpage \fi \thispagestyle{empty}\c@page\z@}
\def\endtitlepage{\if@restonecol\twocolumn \else \newpage \fi}
\arraycolsep 5pt \tabcolsep 6pt \arrayrulewidth .4pt \doublerulesep 2pt
\tabbingsep \labelsep
\skip\@mpfootins = \skip\footins
\fboxsep = 3pt \fboxrule = .4pt
\def\thepart{\Roman{part}}
\def\thesection {\arabic{chapter}.\arabic{section}}
\def\thesubsection {\thesection.\arabic{subsection}}
\def\thesubsubsection {\thesubsection .\arabic{subsubsection}}
\def\theparagraph {\thesubsubsection.\arabic{paragraph}}
\def\thesubparagraph {\theparagraph.\arabic{subparagraph}}
\def\@pnumwidth{1.55em}
\def\@tocrmarg {2.55em}
\def\@dotsep{4.5}
\setcounter{tocdepth}{2}
% set up the information for the title page
% title redefined to also set shorttitle
\def\title#1{\gdef\@title{#1} \gdef\@shorttitle{#1}}
\def\subtitle#1{\gdef\@subtitle{#1}}
\def\authoremail#1{\gdef\@authoremail{#1}}
\def\degreeschemecode#1{\gdef\@degreeschemecode{#1}}
\def\degreeschemetitle#1{\gdef\@degreeschemetitle{#1}}
\def\degreetype#1{\gdef\@degreetype{#1}}
\def\modulecode#1{\gdef\@modulecode{#1}}
\def\moduletitle#1{\gdef\@moduletitle{#1}}
\def\supervisor#1{\gdef\@supervisor{#1}}
\def\supervisoremail#1{\gdef\@supervisoremail{#1}}
\def\shorttitle#1{\gdef\@shorttitle{#1}}
\def\client#1{\gdef\@client{{\it Client:} & #1 \\}} \gdef\@client{}
\def\configref#1{\gdef\@configref{#1}}
\def\projref#1{\gdef\@projref{{\it Proj. Ref.:} & #1 \\}} \gdef\@projref{}
\def\version#1{\gdef\@version{#1}}
\def\status#1{\gdef\@status{#1}}
% set parskip to zero locally within the lists
\def\tableofcontents{\section*{\centering Contents}
{\parskip=0pt \@starttoc{toc}}}
\def\l@part#1#2{\addpenalty{\@secpenalty}
\addvspace{2.25em plus 1pt} \begingroup
\@tempdima 3em \parindent \z@ \rightskip \@pnumwidth \parfillskip
-\@pnumwidth
{\large \bf \leavevmode #1\hfil \hbox to\@pnumwidth{\hss #2}}\par
\nobreak \endgroup}
%\def\l@section#1#2{\addpenalty{\@secpenalty} \addvspace{1.0em plus 1pt}
%\@tempdima 1.5em \begingroup
% \parindent \z@ \rightskip \@pnumwidth
% \parfillskip -\@pnumwidth
% \bf \leavevmode #1\hfil \hbox to\@pnumwidth{\hss #2}\par
% \endgroup}
\def\l@section{\@dottedtocline{1}{1.5em}{2.3em}}
\def\l@subsection{\@dottedtocline{2}{3.8em}{2.9em}}
\def\l@subsubsection{\@dottedtocline{3}{3.8em}{3.2em}}
\def\l@paragraph{\@dottedtocline{4}{7.0em}{4.1em}}
\def\l@subparagraph{\@dottedtocline{5}{10em}{5em}}
\def\listoffigures{\section*{List of Figures}
{\parskip=0pt \@starttoc{lof}}}
\def\l@figure{\@dottedtocline{1}{1.5em}{2.3em}}
\def\listoftables{\section*{List of Tables}
{\parskip=0pt \@starttoc{lot}}}
\let\l@table\l@figure
%%%% Title Page %%%%
\def\maketitle{\par
\begingroup
\def\thefootnote{\fnsymbol{footnote}}
\def\@makefnmark{\hbox
to 0pt{$^{\@thefnmark}$\hss}}
\if@twocolumn
\twocolumn[\@maketitle]
\else \newpage
\global\@topnum\z@ \@maketitle \fi\thispagestyle{myheadings}\@thanks
\endgroup
\setcounter{footnote}{0}
\let\maketitle\relax
\let\@maketitle\relax
\gdef\@thanks{}\gdef\@author{}\gdef\@title{}\let\thanks\relax}
\def\@maketitle{{\headheight 12pt\newpage
\thispagestyle{empty}\c@page\z@
\null
\vskip 0.4in
\hskip 0.3in
\begin{minipage}{5.8in}
\vskip 0.3in
\centering\huge{\bf\textcolor{mmpTitle}\@title\par}
\normalsize
\vskip 0.3in
\centering {\@modulecode} {\@moduletitle} Report
\vskip 0.4in
\centering Author:\hskip 0.05in \@author\hskip 0.05in (\@authoremail)
\vskip 0.05in
\centering Supervisor:\hskip 0.05in \@supervisor\hskip 0.05in (\@supervisoremail)
\vskip 0.3in
\centering {\@date} \vskip 0.05in
\centering Version: {\@version} ({\@status})
\vskip 0.6in
This report was submitted as partial fulfilment of a {\@degreetype} degree in
{\@degreeschemetitle} ({\@degreeschemecode})
\end{minipage}
\vfill
\hskip 0.3in
\begin{minipage}{5.2in}
\begin{tabular}{l}
Department of Computer Science\\
Aberystwyth University\\
Aberystwyth \\
Ceredigion \\
SY23 3DB \\
Wales, U.K. \\
\end{tabular}
\end{minipage}
\newpage}}
%%%% end of Title Page %%%%
%%%%% Header and Footer %%%%
\def\setchapterheaderfooter{
%%%% The following block affects the header and footer on the first page of a chapter
\fancypagestyle{plain}{%
%\fancyhf{} % clear all header and footer fields
\fancyhead[L]{Chapter\ \thechapter}
\fancyhead[R]{\leftmark}
\fancyfoot[C]{{\thepage} of \pageref{LastPage}} % except the center
\renewcommand{\headrulewidth}{0.2pt}
\renewcommand{\footrulewidth}{0pt}}
%%%%
%%%% Setting the header and footer for pages in a section, other than the first page
\fancyhead[L]{Chapter\ \thechapter}
\fancyhead[R]{\leftmark}
\fancyfoot[C]{{\thepage} of \pageref{LastPage}}
\renewcommand{\headrulewidth}{0.2pt}
\renewcommand{\footrulewidth}{0pt}
%%%%
}
\def\setemptyheader{
\fancypagestyle{plain}{%
\fancyhead{}}
\renewcommand{\headrulewidth}{0pt}
}
%%%%% End of Header and Footer %%%%
%%%% Paragraph Indentation %%%%%
\setlength{\parindent}{0in}
%%%% end of Paragraph Indentation %%%%

@ -0,0 +1,576 @@
@article{dcnn,
title = {Photo {Aesthetics} {Analysis} via {DCNN} {Feature} {Encoding}},
volume = {19},
issn = {1941-0077},
doi = {10.1109/TMM.2017.2687759},
abstract = {We propose an automatic framework for quality assessment of a photograph as well as analysis of its aesthetic attributes. In contrast to the previous methods that rely on manually designed features to account for photo aesthetics, our method automatically extracts such features using a pretrained deep convolutional neural network (DCNN). To make the DCNN-extracted features more suited to our target tasks of photo quality assessment and aesthetic attribute analysis, we propose a novel feature encoding scheme, which supports vector machines-driven sparse restricted Boltzmann machines, which enhances sparseness of features and discrimination between target classes. Experimental results show that our method outperforms the current state-of-the-art methods in automatic photo quality assessment, and gives aesthetic attribute ratings that can be used for photo editing. We demonstrate that our feature encoding scheme can also be applied to general object classification task to achieve performance gains.},
number = {8},
journal = {IEEE Transactions on Multimedia},
author = {Lee, Hui-Jin and Hong, Ki-Sang and Kang, Henry and Lee, Seungyong},
month = aug,
year = {2017},
note = {Conference Name: IEEE Transactions on Multimedia},
keywords = {Aesthetic attributes, deep convolutional neural network (DCNN), Encoding, feature encoding, Feature extraction, Mathematical model, Neural networks, photo aesthetics, Quality assessment, restricted Boltzmann machines, Support vector machines, Training},
pages = {1921--1932},
file = {IEEE Xplore Abstract Record:/home/noble/Zotero/storage/E6YUFLQE/7886320.html:text/html},
}
@inproceedings{ava_paper,
title = {{AVA}: {A} large-scale database for aesthetic visual analysis},
shorttitle = {{AVA}},
doi = {10.1109/CVPR.2012.6247954},
abstract = {With the ever-expanding volume of visual content available, the ability to organize and navigate such content by aesthetic preference is becoming increasingly important. While still in its nascent stage, research into computational models of aesthetic preference already shows great potential. However, to advance research, realistic, diverse and challenging databases are needed. To this end, we introduce a new large-scale database for conducting Aesthetic Visual Analysis: AVA. It contains over 250,000 images along with a rich variety of meta-data including a large number of aesthetic scores for each image, semantic labels for over 60 categories as well as labels related to photographic style. We show the advantages of AVA with respect to existing databases in terms of scale, diversity, and heterogeneity of annotations. We then describe several key insights into aesthetic preference afforded by AVA. Finally, we demonstrate, through three applications, how the large scale of AVA can be leveraged to improve performance on existing preference tasks.},
booktitle = {2012 {IEEE} {Conference} on {Computer} {Vision} and {Pattern} {Recognition}},
author = {Murray, Naila and Marchesotti, Luca and Perronnin, Florent},
month = jun,
year = {2012},
note = {ISSN: 1063-6919},
keywords = {Communities, Image color analysis, Semantics, Social network services, Visual databases, Visualization},
pages = {2408--2415},
file = {IEEE Xplore Abstract Record:/home/noble/Zotero/storage/HE3EFVJV/6247954.html:text/html},
}
@misc{kong_photo_2022,
title = {Photo {Aesthetics} {Ranking} {Network} with {Attributes} and {Content} {Adaptation}},
url = {https://github.com/aimerykong/deepImageAestheticsAnalysis},
abstract = {ECCV2016 - fine-grained photo aesthetics rating with interpretability},
urldate = {2022-02-11},
author = {Kong, Shu},
month = jan,
year = {2022},
note = {original-date: 2016-06-05T06:08:10Z},
}
@misc{pytorch,
title = {{PyTorch}.org},
url = {https://www.pytorch.org},
abstract = {An open source machine learning framework that accelerates the path from research prototyping to production deployment.},
language = {en},
urldate = {2022-02-11},
author = {{Adam Paszke} and {Sam Gross} and {Soumith Chintala} and {Gregory Chanan}},
file = {Snapshot:/home/noble/Zotero/storage/HPSFJGU3/pytorch.org.html:text/html},
}
@misc{opencv,
title = {{OpenCV}},
url = {https://opencv.org},
abstract = {OpenCV provides a real-time optimized Computer Vision library, tools, and hardware. It also supports model execution for Machine Learning (ML) and Artificial Intelligence (AI).},
language = {en-US},
urldate = {2022-02-11},
journal = {OpenCV},
author = {{Intel Corporation} and {Willow Garage} and {Itseez}},
file = {Snapshot:/home/noble/Zotero/storage/JYX6KIPC/opencv.org.html:text/html},
}
@misc{ava_dataset,
title = {{AVA} {Dataset}},
url = {https://github.com/imfing/ava_downloader},
abstract = {Download AVA dataset (A Large-Scale Database for Aesthetic Visual Analysis)},
urldate = {2022-02-11},
author = {Fing},
month = feb,
year = {2022},
note = {original-date: 2016-11-13T02:20:32Z},
keywords = {aesthetic-visual-analysis, ava, computer-vision, dataset, python},
}
@misc{noauthor_image_2022,
title = {Image {Quality} {Assessment}},
copyright = {Apache-2.0},
url = {https://github.com/idealo/image-quality-assessment},
abstract = {Convolutional Neural Networks to predict the aesthetic and technical quality of images.},
urldate = {2022-02-11},
publisher = {idealo},
month = feb,
year = {2022},
note = {original-date: 2018-06-12T14:46:09Z},
keywords = {computer-vision, aws, convolutional-neural-networks, deep-learning, e-commerce, idealo, image-quality-assessment, keras, machine-learning, mobilenet, neural-network, nima, tensorflow},
}
@misc{tensorflow,
title = {{TensorFlow}.org},
url = {https://www.tensorflow.org/},
urldate = {2022-02-11},
author = {{Google Brain Team}},
}
@misc{noauthor_most_nodate,
title = {Most used social media 2021},
url = {https://www.statista.com/statistics/272014/global-social-networks-ranked-by-number-of-users/},
abstract = {Facebook, YouTube, and WhatsApp are the most popular social networks worldwide, each with at least two billion active users.},
language = {en},
urldate = {2022-04-26},
journal = {Statista},
file = {Snapshot:/home/noble/Zotero/storage/WQGJ88E7/global-social-networks-ranked-by-number-of-users.html:text/html},
}
@misc{trophy_camera,
title = {Trophy {Camera}, 2017-2022},
url = {https://driesdepoorter.be/trophy-camera/},
abstract = {Trophy Camera is a photo camera that can only make award winning pictures. Just take your photo and check if the camera sees your picture as award winning. This A.I. powered camera has been trained by all previous winning World Press Photos of the year. Based on the identification of labeled patterns, the camera is…},
language = {en-US},
urldate = {2022-04-26},
journal = {Dries Depoorter},
author = {Depoorter, Dries and Pinckers, Max},
file = {Snapshot:/home/noble/Zotero/storage/PV8ZQT4J/trophy-camera.html:text/html},
}
@misc{trophy_camera_site,
title = {"trophy.camera"},
url = {https://trophy.camera/},
urldate = {2022-04-27},
author = {Depoorter, Dries and Pinckers, Max},
file = {"trophy.camera":/home/noble/Zotero/storage/7PYMYIDB/trophy.camera.html:text/html},
}
@misc{wpp,
title = {Home {\textbar} {World} {Press} {Photo}},
url = {https://www.worldpressphoto.org/},
urldate = {2022-04-27},
file = {Home | World Press Photo:/home/noble/Zotero/storage/B8EU43WT/www.worldpressphoto.org.html:text/html},
}
@misc{pyis_laplace,
title = {Blur detection with {OpenCV}},
url = {https://www.pyimagesearch.com/2015/09/07/blur-detection-with-opencv/},
abstract = {In this tutorial, I will teach you how to detect the amount of blur in an image using OpenCV and Python. Perform blur detection using the OpenCV library.},
language = {en-US},
urldate = {2022-04-27},
journal = {PyImageSearch},
month = sep,
year = {2015},
file = {Snapshot:/home/noble/Zotero/storage/KI46W39W/blur-detection-with-opencv.html:text/html},
}
@misc{pyis_fft,
title = {{OpenCV} {Fast} {Fourier} {Transform} ({FFT}) for blur detection in images and video streams},
url = {https://www.pyimagesearch.com/2020/06/15/opencv-fast-fourier-transform-fft-for-blur-detection-in-images-and-video-streams/},
abstract = {In this tutorial, you will learn how to use OpenCV and the Fast Fourier Transform (FFT) to perform blur detection in images and real-time video streams.},
language = {en-US},
urldate = {2022-04-27},
journal = {PyImageSearch},
month = jun,
year = {2020},
}
@misc{python,
title = {Python.org},
url = {https://www.python.org/},
abstract = {The official home of the Python Programming Language},
language = {en},
urldate = {2022-04-27},
journal = {Python.org},
file = {Snapshot:/home/noble/Zotero/storage/PXIQFZZL/www.python.org.html:text/html},
}
@misc{keras,
title = {Keras: the {Python} deep learning {API}},
url = {https://keras.io/},
urldate = {2022-04-27},
file = {Keras\: the Python deep learning API:/home/noble/Zotero/storage/98IXDCT3/keras.io.html:text/html},
}
@misc{google_photos,
title = {Updating {Google} {Photos} storage policy to build for the future},
url = {https://blog.google/products/photos/storage-changes/},
abstract = {In order to welcome even more of your memories and build Google Photos for the future, we are changing our unlimited High quality storage policy.},
language = {en-us},
urldate = {2022-05-02},
journal = {Google},
month = nov,
year = {2020},
file = {Snapshot:/home/noble/Zotero/storage/PBDLGX6I/storage-changes.html:text/html},
}
@misc{google_workspace,
title = {G {Suite}'s unlimited {Google} {Drive} storage will be discontinued with {Workspace}},
url = {https://9to5google.com/2020/10/08/google-workspace-drive-storage-limits/},
abstract = {While G Suite offered it to businesses, Google Workspace has completely dropped the option to have unlimited storage on Google Drive.},
language = {en-US},
urldate = {2022-05-02},
journal = {9to5Google},
author = {Schoon, Ben},
month = oct,
year = {2020},
file = {Snapshot:/home/noble/Zotero/storage/RCM4U3V4/google-workspace-drive-storage-limits.html:text/html},
}
@misc{instagram_aesthetic,
title = {Why {Your} {Instagram} {Profile} is the {New} {Home} {Page}},
url = {https://later.com/blog/instagram-profile-home-page/},
abstract = {It's time to treat your Instagram profile as seriously as the home page of your website. Learn how to optimize your Instagram profile to convert visitors into followers!},
language = {en},
urldate = {2022-05-02},
file = {Snapshot:/home/noble/Zotero/storage/HEUXGVXA/instagram-profile-home-page.html:text/html},
}
@misc{instagram_planner,
title = {{PREVIEW} - {Plan} your {Instagram} - {Apps} on {Google} {Play}},
url = {https://play.google.com/store/apps/details?id=com.sensio.instapreview&hl=en_GB&gl=GB},
abstract = {ALL-IN-ONE Instagram tool: Presets, themes, plan, analytics, schedule \& repost.},
language = {en},
urldate = {2022-05-02},
file = {Snapshot:/home/noble/Zotero/storage/8KHZSN73/details.html:text/html},
}
@misc{unum,
title = {{UNUM} — {World}'s easiest marketing tool},
url = {https://www.unum.la/},
abstract = {Save 7 hours every week with UNUM — your all-in-one design, planning, and marketing platform for your most important channels. Loved by millions of creators, small businesses, agencies, and marketing teams worldwide.},
urldate = {2022-05-02},
file = {Snapshot:/home/noble/Zotero/storage/SK45SUA4/www.unum.la.html:text/html},
}
@misc{golden_ratio,
title = {Golden ratio: {A} beginner's guide {\textbar} {Adobe}},
shorttitle = {Golden ratio},
url = {https://www.adobe.com/creativecloud/design/discover/golden-ratio.html},
abstract = {The golden ratio is one of the most famous ratios in mathematics and design. Learn more about the golden ratio and its role in art and design.},
language = {en-US},
urldate = {2022-05-02},
file = {Snapshot:/home/noble/Zotero/storage/6CMQQJM8/golden-ratio.html:text/html},
}
@misc{symmetry,
title = {Symmetry in {Photography}: {The} {Ultimate} {Guide} to {Using} {Symmetry} in {Your} {Photos}},
shorttitle = {Symmetry in {Photography}},
url = {https://www.photoworkout.com/symmetry-in-photography/},
abstract = {Symmetry in photography makes for bold, in-your-face compositions. Discover how to use symmetry for the best results!},
language = {en-US},
urldate = {2022-05-02},
journal = {PhotoWorkout},
month = sep,
year = {2020},
file = {Snapshot:/home/noble/Zotero/storage/5KV5F7K8/symmetry-in-photography.html:text/html},
}
@misc{rule-of-thirds,
title = {What is the {Rule} of {Thirds} and {How} to {Use} it to {Improve} {Your} {Photos}},
url = {https://photographylife.com/the-rule-of-thirds},
abstract = {Read this detailed article to understand how the rule of thirds works, when to and when not to use it when composing images.},
language = {en-US},
urldate = {2022-05-02},
journal = {Photography Life},
month = mar,
year = {2018},
file = {Snapshot:/home/noble/Zotero/storage/GKPUDLZK/the-rule-of-thirds.html:text/html},
}
@misc{dof,
title = {Understanding shallow depth of field photography {\textbar} {Adobe}},
url = {https://www.adobe.com/creativecloud/photography/discover/shallow-depth-of-field.html},
abstract = {Shallow depth of field is achieved by shooting photographs with a low f-number to let in more light. Learn more about shallow depth of field today.},
language = {en-US},
urldate = {2022-05-02},
file = {Snapshot:/home/noble/Zotero/storage/LKK7R2RN/shallow-depth-of-field.html:text/html},
}
@misc{flickr,
title = {Flickr},
url = {https://www.flickr.com},
urldate = {2022-05-02},
file = {2007 Spencer Platt WY | World Press Photo:/home/noble/Zotero/storage/I8L8PC7M/1.html:text/html},
}
@misc{aqa-on-ava,
title = {Papers with {Code} - {AVA} {Benchmark} ({Aesthetics} {Quality} {Assessment})},
url = {https://paperswithcode.com/sota/aesthetics-quality-assessment-on-ava},
abstract = {The current state-of-the-art on AVA is MP\_adam. See a full comparison of 9 papers with code.},
language = {en},
urldate = {2022-05-03},
file = {Snapshot:/home/noble/Zotero/storage/LMRI5CPK/aesthetics-quality-assessment-on-ava.html:text/html},
}
@misc{archillect,
title = {Archillect},
url = {https://archillect.com/},
abstract = {The Ocular Engine},
language = {en},
urldate = {2022-05-03},
file = {Snapshot:/home/noble/Zotero/storage/Q8V5TTS5/archillect.com.html:text/html},
}
@misc{noauthor_theres_nodate,
title = {There's a way to pick the absolute best images for your content: {Apply} {AI} {\textbar} {TechCrunch}},
url = {https://techcrunch.com/2020/10/01/theres-a-way-to-pick-the-absolute-best-images-for-your-content-apply-ai/?guccounter=1},
urldate = {2022-05-03},
file = {Theres a way to pick the absolute best images for your content\: Apply AI | TechCrunch:/home/noble/Zotero/storage/TKV3EA5U/theres-a-way-to-pick-the-absolute-best-images-for-your-content-apply-ai.html:text/html},
}
@article{intrinsic_image_popularity,
title = {Intrinsic {Image} {Popularity} {Assessment}},
url = {http://arxiv.org/abs/1907.01985},
doi = {10.1145/3343031.3351007},
abstract = {The goal of research in automatic image popularity assessment (IPA) is to develop computational models that can accurately predict the potential of a social image to go viral on the Internet. Here, we aim to single out the contribution of visual content to image popularity, i.e., intrinsic image popularity. Specifically, we first describe a probabilistic method to generate massive popularity-discriminable image pairs, based on which the first large-scale image database for intrinsic IPA (I2PA) is established. We then develop computational models for I2PA based on deep neural networks, optimizing for ranking consistency with millions of popularity-discriminable image pairs. Experiments on Instagram and other social platforms demonstrate that the optimized model performs favorably against existing methods, exhibits reasonable generalizability on different databases, and even surpasses human-level performance on Instagram. In addition, we conduct a psychophysical experiment to analyze various aspects of human behavior in I2PA.},
language = {en},
urldate = {2022-05-03},
journal = {Proceedings of the 27th ACM International Conference on Multimedia},
author = {Ding, Keyan and Ma, Kede and Wang, Shiqi},
month = oct,
year = {2019},
note = {arXiv: 1907.01985},
keywords = {Computer Science - Computer Vision and Pattern Recognition, Computer Science - Multimedia},
pages = {1979--1987},
file = {Ding et al. - 2019 - Intrinsic Image Popularity Assessment.pdf:/home/noble/Zotero/storage/69F498QV/Ding et al. - 2019 - Intrinsic Image Popularity Assessment.pdf:application/pdf},
}
@misc{pytorch-rocm,
title = {{PyTorch} for {AMD} {ROCm}™ {Platform} now available as {Python} package},
url = {https://pytorch.org/blog/pytorch-for-amd-rocm-platform-now-available-as-python-package/},
abstract = {An open source machine learning framework that accelerates the path from research prototyping to production deployment.},
language = {en},
urldate = {2022-05-03},
file = {Snapshot:/home/noble/Zotero/storage/K88CKU33/pytorch-for-amd-rocm-platform-now-available-as-python-package.html:text/html},
}
@misc{gcp-gpu,
title = {Create a {VM} with attached {GPUs} {\textbar} {Compute} {Engine} {Documentation} {\textbar} {Google} {Cloud}},
url = {https://cloud.google.com/compute/docs/gpus/create-vm-with-gpus},
language = {en},
urldate = {2022-05-03},
file = {Snapshot:/home/noble/Zotero/storage/EIDYTHME/create-vm-with-gpus.html:text/html},
}
@misc{slurm,
title = {Slurm {Workload} {Manager} - {Overview}},
url = {https://slurm.schedmd.com/overview.html},
urldate = {2022-05-03},
file = {Slurm Workload Manager - Overview:/home/noble/Zotero/storage/53EBVW79/overview.html:text/html},
}
@misc{anaconda,
title = {Anaconda {\textbar} {The} {World}'s {Most} {Popular} {Data} {Science} {Platform}},
url = {https://www.anaconda.com/},
abstract = {Anaconda is the birthplace of Python data science. We are a movement of data scientists, data-driven enterprises, and open source communities.},
language = {en},
urldate = {2022-05-03},
journal = {Anaconda},
file = {Snapshot:/home/noble/Zotero/storage/QI8YM43G/www.anaconda.com.html:text/html},
}
@misc{rocm-ml,
title = {{ROCm}™: {Machine} {Learning}},
shorttitle = {{ROCm}™},
url = {https://www.amd.com/en/graphics/servers-solutions-rocm-ml},
language = {en},
urldate = {2022-05-03},
file = {Snapshot:/home/noble/Zotero/storage/FP6GHSEP/servers-solutions-rocm-ml.html:text/html},
}
@misc{pyis-cnn,
title = {{PyTorch}: {Training} your first {Convolutional} {Neural} {Network} ({CNN})},
shorttitle = {{PyTorch}},
url = {https://www.pyimagesearch.com/2021/07/19/pytorch-training-your-first-convolutional-neural-network-cnn/},
abstract = {In this tutorial, you will receive a gentle introduction to training your first Convolutional Neural Network (CNN) using the PyTorch deep learning library.},
language = {en-US},
urldate = {2022-05-04},
journal = {PyImageSearch},
month = jul,
year = {2021},
file = {Snapshot:/home/noble/Zotero/storage/A79TLMS6/pytorch-training-your-first-convolutional-neural-network-cnn.html:text/html},
}
@misc{dpchallenge,
title = {{DPChallenge} - {A} {Digital} {Photography} {Contest}},
url = {https://www.dpchallenge.com/},
urldate = {2022-05-04},
file = {DPChallenge - A Digital Photography Contest:/home/noble/Zotero/storage/8HYS69YJ/www.dpchallenge.com.html:text/html},
}
@misc{wpp-spencer,
title = {2007 {Spencer} {Platt} {WY} {\textbar} {World} {Press} {Photo}},
url = {https://www.worldpressphoto.org/collection/photo-contest/2007/spencer-platt/1},
urldate = {2022-05-02},
}
@article{aadb_paper,
title = {Photo {Aesthetics} {Ranking} {Network} with {Attributes} and {Content} {Adaptation}},
url = {http://arxiv.org/abs/1606.01621},
abstract = {Real-world applications could benefit from the ability to automatically generate a fine-grained ranking of photo aesthetics. However, previous methods for image aesthetics analysis have primarily focused on the coarse, binary categorization of images into high- or low-aesthetic categories. In this work, we propose to learn a deep convolutional neural network to rank photo aesthetics in which the relative ranking of photo aesthetics are directly modeled in the loss function. Our model incorporates joint learning of meaningful photographic attributes and image content information which can help regularize the complicated photo aesthetics rating problem.},
language = {en},
urldate = {2022-05-04},
journal = {arXiv:1606.01621 [cs]},
author = {Kong, Shu and Shen, Xiaohui and Lin, Zhe and Mech, Radomir and Fowlkes, Charless},
month = jul,
year = {2016},
note = {arXiv: 1606.01621},
keywords = {Computer Science - Computer Vision and Pattern Recognition, Computer Science - Multimedia, Computer Science - Information Retrieval},
file = {Kong et al. - 2016 - Photo Aesthetics Ranking Network with Attributes a.pdf:/home/noble/Zotero/storage/EXJXBA6X/Kong et al. - 2016 - Photo Aesthetics Ranking Network with Attributes a.pdf:application/pdf},
}
@misc{ml-module,
title = {Current {Modules} by {Department}: {Modules}, {Aberystwyth} {University}},
shorttitle = {Current {Modules} by {Department}},
url = {https://www.aber.ac.uk/en/modules/deptcurrent/CS36220/},
abstract = {Information about current modules offered at Aberystwyth University, listed by department.},
language = {en},
urldate = {2022-05-04},
author = {{Aberystwyth University Web Team}},
note = {Last Modified: 2022-05-04},
file = {Snapshot:/home/noble/Zotero/storage/YEJGV5C2/CS36220.html:text/html},
}
@misc{stanford_dl,
title = {Lecture 8 {\textbar} {Deep} {Learning} {Software}},
url = {https://www.youtube.com/watch?v=6SlgtELqOWc},
abstract = {In Lecture 8 we discuss the use of different software packages for deep learning, focusing on TensorFlow and PyTorch. We also discuss some differences between CPUs and GPUs.
Keywords: CPU vs GPU, TensorFlow, Keras, Theano, Torch, PyTorch, Caffe, Caffe2, dynamic vs static computational graphs
Slides: http://cs231n.stanford.edu/slides/201...
--------------------------------------------------------------------------------------
Convolutional Neural Networks for Visual Recognition
Instructors:
Fei-Fei Li: http://vision.stanford.edu/feifeili/
Justin Johnson: http://cs.stanford.edu/people/jcjohns/
Serena Yeung: http://ai.stanford.edu/{\textasciitilde}syyeung/
Computer Vision has become ubiquitous in our society, with applications in search, image understanding, apps, mapping, medicine, drones, and self-driving cars. Core to many of these applications are visual recognition tasks such as image classification, localization and detection. Recent developments in neural network (aka “deep learning”) approaches have greatly advanced the performance of these state-of-the-art visual recognition systems. This lecture collection is a deep dive into details of the deep learning architectures with a focus on learning end-to-end models for these tasks, particularly image classification. From this lecture collection, students will learn to implement, train and debug their own neural networks and gain a detailed understanding of cutting-edge research in computer vision.
Website:
http://cs231n.stanford.edu/
For additional learning opportunities please visit:
http://online.stanford.edu/},
urldate = {2022-05-04},
author = {{Stanford University School of Engineering}},
month = aug,
year = {2017},
}
@misc{stanford_cnn,
title = {Lecture 9 {\textbar} {CNN} {Architectures}},
url = {https://www.youtube.com/watch?v=DAOcjicFr1Y},
abstract = {In Lecture 9 we discuss some common architectures for convolutional neural networks. We discuss architectures which performed well in the ImageNet challenges, including AlexNet, VGGNet, GoogLeNet, and ResNet, as well as other interesting models.
Keywords: AlexNet, VGGNet, GoogLeNet, ResNet, Network in Network, Wide ResNet, ResNeXT, Stochastic Depth, DenseNet, FractalNet, SqueezeNet
Slides: http://cs231n.stanford.edu/slides/201...
--------------------------------------------------------------------------------------
Convolutional Neural Networks for Visual Recognition
Instructors:
Fei-Fei Li: http://vision.stanford.edu/feifeili/
Justin Johnson: http://cs.stanford.edu/people/jcjohns/
Serena Yeung: http://ai.stanford.edu/{\textasciitilde}syyeung/
Computer Vision has become ubiquitous in our society, with applications in search, image understanding, apps, mapping, medicine, drones, and self-driving cars. Core to many of these applications are visual recognition tasks such as image classification, localization and detection. Recent developments in neural network (aka “deep learning”) approaches have greatly advanced the performance of these state-of-the-art visual recognition systems. This lecture collection is a deep dive into details of the deep learning architectures with a focus on learning end-to-end models for these tasks, particularly image classification. From this lecture collection, students will learn to implement, train and debug their own neural networks and gain a detailed understanding of cutting-edge research in computer vision.
Website:
http://cs231n.stanford.edu/
For additional learning opportunities please visit:
http://online.stanford.edu/},
urldate = {2022-05-04},
author = {{Stanford University School of Engineering}},
month = aug,
year = {2017},
}
@misc{pytorch-training,
title = {Training with {PyTorch} — {PyTorch} {Tutorials} 1.11.0+cu102 documentation},
url = {https://pytorch.org/tutorials/beginner/introyt/trainingyt.html},
urldate = {2022-05-04},
file = {Training with PyTorch — PyTorch Tutorials 1.11.0+cu102 documentation:/home/noble/Zotero/storage/ZPY5XLIN/trainingyt.html:text/html},
}
@misc{imagenet,
title = {{ImageNet}},
url = {https://www.image-net.org/index.php},
urldate = {2022-05-04},
file = {ImageNet:/home/noble/Zotero/storage/AZR8PDKP/index.html:text/html},
}
@misc{skimage,
title = {skimage — skimage v0.19.2 docs},
url = {https://scikit-image.org/docs/stable/api/skimage.html},
urldate = {2022-05-04},
file = {skimage — skimage v0.19.2 docs:/home/noble/Zotero/storage/K8SAES3B/skimage.html:text/html},
}
@misc{unsplash,
title = {Photo by {ZHENYU} {LUO} on {Unsplash}},
url = {https://unsplash.com/photos/mhvL46eQis0},
abstract = {Download this photo by ZHENYU LUO on Unsplash},
language = {en},
urldate = {2022-05-05},
author = {Unsplash},
file = {Snapshot:/home/noble/Zotero/storage/EBF2RRHE/mhvL46eQis0.html:text/html},
}
@misc{gimp,
title = {{GIMP}},
url = {https://www.gimp.org/},
abstract = {GIMP - The GNU Image Manipulation Program: The Free and Open Source Image Editor},
language = {en},
urldate = {2022-05-05},
journal = {GIMP},
file = {Snapshot:/home/noble/Zotero/storage/KNSWAL4N/www.gimp.org.html:text/html},
}
@misc{pyis_contrast,
title = {Detecting low contrast images with {OpenCV}, scikit-image, and {Python}},
url = {https://www.pyimagesearch.com/2021/01/25/detecting-low-contrast-images-with-opencv-scikit-image-and-python/},
abstract = {In this tutorial you will learn how to detect low contrast images using OpenCV and scikit-image.},
language = {en-US},
urldate = {2022-05-06},
journal = {PyImageSearch},
month = jan,
year = {2021},
file = {Snapshot:/home/noble/Zotero/storage/WNHWPBW8/detecting-low-contrast-images-with-opencv-scikit-image-and-python.html:text/html},
}
@misc{pyis_keras,
title = {Regression with {Keras}},
url = {https://www.pyimagesearch.com/2019/01/21/regression-with-keras/},
abstract = {In this tutorial you will learn how to perform regression using Keras. You will learn how to train a Keras neural network for regression and continuous value prediction, specifically in the context of house price prediction.},
language = {en-US},
urldate = {2022-05-06},
journal = {PyImageSearch},
month = jan,
year = {2019},
file = {Snapshot:/home/noble/Zotero/storage/DBW7BKZV/regression-with-keras.html:text/html},
}
@misc{pyis_keras_cnn,
title = {Keras, {Regression}, and {CNNs}},
url = {https://www.pyimagesearch.com/2019/01/28/keras-regression-and-cnns/},
abstract = {In this tutorial you will learn how to train a Convolutional Neural Network (CNN) for regression prediction with Keras and deep learning.},
language = {en-US},
urldate = {2022-05-06},
journal = {PyImageSearch},
month = jan,
year = {2019},
file = {Snapshot:/home/noble/Zotero/storage/8CWZPFAB/keras-regression-and-cnns.html:text/html},
}
@misc{house_dataset,
title = {House {Prices} - {Advanced} {Regression} {Techniques}},
url = {https://kaggle.com/competitions/house-prices-advanced-regression-techniques},
abstract = {Predict sales prices and practice feature engineering, RFs, and gradient boosting},
language = {en},
urldate = {2022-05-06},
file = {Snapshot:/home/noble/Zotero/storage/IYF6ZKAL/data.html:text/html},
}
@misc{pytorch-resnet,
title = {{PyTorch}: {ResNet}},
url = {https://pytorch.org/hub/pytorch_vision_resnet/},
abstract = {An open source machine learning framework that accelerates the path from research prototyping to production deployment.},
language = {en},
urldate = {2022-05-06},
file = {Snapshot:/home/noble/Zotero/storage/ZRCB5F9V/pytorch_vision_resnet.html:text/html},
}
@misc{pyis_pytorch_transferlearning,
title = {{PyTorch}: {Transfer} {Learning} and {Image} {Classification} - {PyImageSearch}},
url = {https://pyimagesearch.com/2021/10/11/pytorch-transfer-learning-and-image-classification/},
urldate = {2022-05-06},
file = {PyTorch\: Transfer Learning and Image Classification - PyImageSearch:/home/noble/Zotero/storage/MK2AS3NN/pytorch-transfer-learning-and-image-classification.html:text/html},
}

BIN
docs/report/img/AVA-entry.pdf (Stored with Git LFS) Normal file

Binary file not shown.

BIN
docs/report/img/HL-Architecture-Diagram.pdf (Stored with Git LFS) Normal file

Binary file not shown.

@ -0,0 +1,58 @@
%% Creator: Inkscape 1.1.2 (0a00cf5339, 2022-02-04, custom), www.inkscape.org
%% PDF/EPS/PS + LaTeX output extension by Johan Engelen, 2010
%% Accompanies image file 'HL-Architecture-Diagram.pdf' (pdf, eps, ps)
%%
%% To include the image in your LaTeX document, write
%% \input{<filename>.pdf_tex}
%% instead of
%% \includegraphics{<filename>.pdf}
%% To scale the image, write
%% \def\svgwidth{<desired width>}
%% \input{<filename>.pdf_tex}
%% instead of
%% \includegraphics[width=<desired width>]{<filename>.pdf}
%%
%% Images with a different path to the parent latex file can
%% be accessed with the `import' package (which may need to be
%% installed) using
%% \usepackage{import}
%% in the preamble, and then including the image with
%% \import{<path to file>}{<filename>.pdf_tex}
%% Alternatively, one can specify
%% \graphicspath{{<path to file>/}}
%%
%% For more information, please see info/svg-inkscape on CTAN:
%% http://tug.ctan.org/tex-archive/info/svg-inkscape
%%
\begingroup%
\makeatletter%
\providecommand\color[2][]{%
\errmessage{(Inkscape) Color is used for the text in Inkscape, but the package 'color.sty' is not loaded}%
\renewcommand\color[2][]{}%
}%
\providecommand\transparent[1]{%
\errmessage{(Inkscape) Transparency is used (non-zero) for the text in Inkscape, but the package 'transparent.sty' is not loaded}%
\renewcommand\transparent[1]{}%
}%
\providecommand\rotatebox[2]{#2}%
\newcommand*\fsize{\dimexpr\f@size pt\relax}%
\newcommand*\lineheight[1]{\fontsize{\fsize}{#1\fsize}\selectfont}%
\ifx\svgwidth\undefined%
\setlength{\unitlength}{1582.18444824bp}%
\ifx\svgscale\undefined%
\relax%
\else%
\setlength{\unitlength}{\unitlength * \real{\svgscale}}%
\fi%
\else%
\setlength{\unitlength}{\svgwidth}%
\fi%
\global\let\svgwidth\undefined%
\global\let\svgscale\undefined%
\makeatother%
\begin{picture}(1,0.34802557)%
\lineheight{1}%
\setlength\tabcolsep{0pt}%
\put(0,0){\includegraphics[width=\unitlength,page=1]{HL-Architecture-Diagram.pdf}}%
\end{picture}%
\endgroup%

BIN
docs/report/img/HL-Diagram-CNN-Simplified.pdf (Stored with Git LFS) Normal file

Binary file not shown.

@ -0,0 +1,58 @@
%% Creator: Inkscape 1.1.2 (0a00cf5339, 2022-02-04, custom), www.inkscape.org
%% PDF/EPS/PS + LaTeX output extension by Johan Engelen, 2010
%% Accompanies image file 'HL-Diagram-CNN-Simplified.pdf' (pdf, eps, ps)
%%
%% To include the image in your LaTeX document, write
%% \input{<filename>.pdf_tex}
%% instead of
%% \includegraphics{<filename>.pdf}
%% To scale the image, write
%% \def\svgwidth{<desired width>}
%% \input{<filename>.pdf_tex}
%% instead of
%% \includegraphics[width=<desired width>]{<filename>.pdf}
%%
%% Images with a different path to the parent latex file can
%% be accessed with the `import' package (which may need to be
%% installed) using
%% \usepackage{import}
%% in the preamble, and then including the image with
%% \import{<path to file>}{<filename>.pdf_tex}
%% Alternatively, one can specify
%% \graphicspath{{<path to file>/}}
%%
%% For more information, please see info/svg-inkscape on CTAN:
%% http://tug.ctan.org/tex-archive/info/svg-inkscape
%%
\begingroup%
\makeatletter%
\providecommand\color[2][]{%
\errmessage{(Inkscape) Color is used for the text in Inkscape, but the package 'color.sty' is not loaded}%
\renewcommand\color[2][]{}%
}%
\providecommand\transparent[1]{%
\errmessage{(Inkscape) Transparency is used (non-zero) for the text in Inkscape, but the package 'transparent.sty' is not loaded}%
\renewcommand\transparent[1]{}%
}%
\providecommand\rotatebox[2]{#2}%
\newcommand*\fsize{\dimexpr\f@size pt\relax}%
\newcommand*\lineheight[1]{\fontsize{\fsize}{#1\fsize}\selectfont}%
\ifx\svgwidth\undefined%
\setlength{\unitlength}{508.89453173bp}%
\ifx\svgscale\undefined%
\relax%
\else%
\setlength{\unitlength}{\unitlength * \real{\svgscale}}%
\fi%
\else%
\setlength{\unitlength}{\svgwidth}%
\fi%
\global\let\svgwidth\undefined%
\global\let\svgscale\undefined%
\makeatother%
\begin{picture}(1,1.01924366)%
\lineheight{1}%
\setlength\tabcolsep{0pt}%
\put(0,0){\includegraphics[width=\unitlength,page=1]{HL-Diagram-CNN-Simplified.pdf}}%
\end{picture}%
\endgroup%

BIN
docs/report/img/HL-Diagram-CNN.pdf (Stored with Git LFS) Normal file

Binary file not shown.

@ -0,0 +1,58 @@
%% Creator: Inkscape 1.1.2 (0a00cf5339, 2022-02-04, custom), www.inkscape.org
%% PDF/EPS/PS + LaTeX output extension by Johan Engelen, 2010
%% Accompanies image file 'HL-Diagram-CNN.pdf' (pdf, eps, ps)
%%
%% To include the image in your LaTeX document, write
%% \input{<filename>.pdf_tex}
%% instead of
%% \includegraphics{<filename>.pdf}
%% To scale the image, write
%% \def\svgwidth{<desired width>}
%% \input{<filename>.pdf_tex}
%% instead of
%% \includegraphics[width=<desired width>]{<filename>.pdf}
%%
%% Images with a different path to the parent latex file can
%% be accessed with the `import' package (which may need to be
%% installed) using
%% \usepackage{import}
%% in the preamble, and then including the image with
%% \import{<path to file>}{<filename>.pdf_tex}
%% Alternatively, one can specify
%% \graphicspath{{<path to file>/}}
%%
%% For more information, please see info/svg-inkscape on CTAN:
%% http://tug.ctan.org/tex-archive/info/svg-inkscape
%%
\begingroup%
\makeatletter%
\providecommand\color[2][]{%
\errmessage{(Inkscape) Color is used for the text in Inkscape, but the package 'color.sty' is not loaded}%
\renewcommand\color[2][]{}%
}%
\providecommand\transparent[1]{%
\errmessage{(Inkscape) Transparency is used (non-zero) for the text in Inkscape, but the package 'transparent.sty' is not loaded}%
\renewcommand\transparent[1]{}%
}%
\providecommand\rotatebox[2]{#2}%
\newcommand*\fsize{\dimexpr\f@size pt\relax}%
\newcommand*\lineheight[1]{\fontsize{\fsize}{#1\fsize}\selectfont}%
\ifx\svgwidth\undefined%
\setlength{\unitlength}{932.878906bp}%
\ifx\svgscale\undefined%
\relax%
\else%
\setlength{\unitlength}{\unitlength * \real{\svgscale}}%
\fi%
\else%
\setlength{\unitlength}{\svgwidth}%
\fi%
\global\let\svgwidth\undefined%
\global\let\svgscale\undefined%
\makeatother%
\begin{picture}(1,1.83062345)%
\lineheight{1}%
\setlength\tabcolsep{0pt}%
\put(0,0){\includegraphics[width=\unitlength,page=1]{HL-Diagram-CNN.pdf}}%
\end{picture}%
\endgroup%


BIN
docs/report/img/aadb.png (Stored with Git LFS) Normal file

Binary file not shown.

BIN
docs/report/img/aqa-on-ava.png (Stored with Git LFS) Normal file

Binary file not shown.

BIN
docs/report/img/archillect.png (Stored with Git LFS) Normal file

Binary file not shown.

BIN
docs/report/img/calculate_image_rating.pdf (Stored with Git LFS) Normal file

Binary file not shown.

BIN
docs/report/img/cnn-testing.png (Stored with Git LFS) Normal file

Binary file not shown.

View file

@ -0,0 +1 @@
<mxfile host="app.diagrams.net" modified="2022-05-04T18:47:51.528Z" agent="5.0 (X11)" etag="9YXM8JNOyMK4nvM1R8eU" version="17.5.0" type="device"><diagram id="C5RBs43oDa-KdzZeNtuy" name="Page-1">7ZpZc6M4EIB/DVW7qcqUAQP2o+Mcmx1nrsxOZvZlSwYZ2BWIEXJs5tevhCTOxCHOgZ3yg8uokYTo7k/qltDMabS+ICAJrrAHkWYMvLVmnmqGYQysEfvjkkxInJEjBD4JPSHSS8F1+AtK4UBKl6EH01pFijGiYVIXujiOoUtrMkAIXtWrLTCqPzUBPmwJrl2A2tKb0KOBkI4Mp5T/AUM/UE/W7bG4EwFVWb5JGgAPryoi80wzpwRjKq6i9RQirjyll5vL7AbN/rMv/vyc/gR/nbz/+uHbsejs/DFNilcgMKZbd/0z/Zz8fbX65/o2y7L3X39Nrz6ax4Z8NZopfUGPqU8WMaEB9nEM0FkpPSF4GXuQ9zpgpbLODOOECXUm/BdSmklfAEuKmSigEZJ32VuQ7Dtv/85SxR+yu7xwuq6VMlkSY+UDbJj2Ab3IeileEhduUIa0MwXEh3RDPb0wPqMG4giyQbJ2BCJAw9v64IB0X7+oV5qIXUgrPcJicpC3AC3lky7jZMmGO8lVQYO0ZdLSYFz7qyCk8DoBuSpWDPq6cWT3kFC43kLNbbXIXixHjlzOI7YsrkoodUVaUAFS1Xt2RQ4Prt9w6Q6ub/bp+nrL9WcYeLxVxBeBpjnZfJ3wS6YkgBBE2CcgYrpMIAnZgCBp3vtU3niIkkW4hmqxe01q9GHf2NgHbBo4dMDG6hMbs4XNF5hyTRfgsIiMP5j9EF4x9+cjSzFa0hDH+7GaGGbfXIwOXDT8vQMXTp9cWC0upph7aoULFyP2qvyNcw+XoMyOwNH8aD/AMI2+wVCJ4IGMwuM7kDHukwynRcbZmhLgcjRwjPhAaMBxmLHfb2gZhTGIXfg7JyYALI9HuwmHYzbg6J2NQ/7d9Pku+XevCfi4BccHTCKAREQluCiQ2Dci9P6RsA5INF29CxPDPpnQ26n5BaQFDoA5rgioRAW+jUv2Hxaj/9iq52y8hkpJTl+wDDvCUtm+3w4W2fQTDmNa8Y9x3T+Go4bdxcBkq4bpi2E8wRvacdsP2N4O5gaYgTmjrGZVtoL5Mbt2mTHyLTAOUOgCNJE3otDzhAPx7QMwz/vjdk34C+WvaJ1o1ukmAuUpj2ysFQRVfWCDp9+L5vHgnQyltrWhqoIXixS+iHmMwxbBVrD2urKpKb8W7r0NpIQ/bkLKtG2nNqMp6+4uZMqtKvbSDBsxPZx44S279PnlZXpHbKLqzYmqpiRsJJXGd/SnwhqfQEChCG1A/LQ+xfBowFwj4Ife97RreKI69AjgGrCJonHcIaXFQYexOycdrdxc7xhcNRfZZ0PfGPc6XxeF3QiulOpffr7Om04IAVmlgpwQ7429xnYj9nIGDQcQPT7vit5eHCaep4lzeE3uFqMwzffNFjztCRGjDqoqO3pUPx7s2h6ycchzWoS9wocqW5FomXdnQecd6+uvQu4bTpWMB1Mle6zWtr2J68xedsp3hmSz1y1vs/0JwVtJgoRfbdxXcGyrPkE9jZV1HblX2Hdop0QTIs8q8v+UhnkvEc6LMjIR0cs8lxDswjTlJJ23zP5mMo5h17OPF8s4zPY2+h2J4cclLT74rASXHdNK9g8irvF4nia5tpsh6aPyzH3+uK7pAJZlvVicy4rll+KC7PJ7e/Psfw==</diagram></mxfile>

BIN
docs/report/img/diagrams/filter_brightness.pdf (Stored with Git LFS) Normal file

Binary file not shown.

View file

@ -0,0 +1,16 @@
@startuml
start
:Input: list of paths;
if (Does the list only have one item which is also a directory?) then (yes)
:List files inside the directory;
:Overwrite the original list with those new files;
else (no)
endif
if (Ignore video files?) then (yes)
:Add each file that ends in an image file extension to a new list;
else (no)
: Add each file that ends in an image or a video file extension to a new list;
endif
:Output: list of filtered paths;
stop
@enduml

BIN
docs/report/img/diagrams/filter_paths.pdf (Stored with Git LFS) Normal file

Binary file not shown.


View file

@ -0,0 +1,16 @@
@startuml
start
:Input: list of paths;
if (Does the list only have one item which is also a directory?) then (yes)
:List files inside the directory;
:Overwrite the original list with those new files;
else (no)
endif
if (Ignore video files?) then (yes)
:Add each file that ends in an image file extension to a new list;
else (no)
: Add each file that ends in an image or a video file extension to a new list;
endif
:Output: list of filtered paths;
stop
@enduml
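
The flowchart above can be read as a short filtering routine. What follows is a minimal Python sketch of that reading, not the repository's actual function: the name `filter_paths_sketch` and the `ignore_video` keyword are illustrative assumptions, and the extension tuples are copied from the `image_formats` and `video_formats` globals that appear elsewhere in this commit.

```python
import os

# Extension tuples as declared in the project's main module.
IMAGE_FORMATS = (".jpg", ".jpeg", ".png")
VIDEO_FORMATS = (".mp4", ".mov", ".avi", ".flv", ".mkv")


def filter_paths_sketch(paths, ignore_video=False):
    """Hypothetical helper mirroring the filter_paths flowchart."""
    # A single directory argument is replaced by the files it contains.
    if len(paths) == 1 and os.path.isdir(paths[0]):
        directory = paths[0]
        paths = [os.path.join(directory, name)
                 for name in sorted(os.listdir(directory))]
    # Keep images only, or images and videos, depending on the flag.
    accepted = IMAGE_FORMATS if ignore_video else IMAGE_FORMATS + VIDEO_FORMATS
    return [path for path in paths if path.lower().endswith(accepted)]
```

Sorting the directory listing is an addition for reproducible ordering; the diagram itself does not specify an order.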

BIN
docs/report/img/diagrams/load_config.pdf (Stored with Git LFS) Normal file

Binary file not shown.


View file

@ -0,0 +1,17 @@
@startuml
start
:Input: path to a config file;
if (Does the path exist?) then (yes)
if (Does the path end with .yml or .yaml?) then (yes)
:Attempt to read the file;
:Output: dictionary of file contents;
stop
else (no)
#pink:Error: File is not a YAML file;
kill
endif
else (no)
#pink:Error: File doesn't exist;
kill
endif
@enduml
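
As a reading aid, the two checks in the diagram above could be sketched in Python as follows. This is an illustration under stated assumptions rather than the project's `load_config`: the function name `load_config_sketch` is hypothetical, `ValueError` is an assumed exception type for the "not a YAML file" branch (the diagram only names the error message), and parsing relies on PyYAML's `yaml.safe_load`.

```python
import os


def load_config_sketch(path):
    """Hypothetical helper: return a YAML config file as a dictionary."""
    abs_path = os.path.abspath(path)
    # First decision in the diagram: does the path exist?
    if not os.path.exists(abs_path):
        raise FileNotFoundError("Path does not exist: {}".format(abs_path))
    # Second decision: does the path end with .yml or .yaml?
    if not abs_path.endswith((".yml", ".yaml")):
        # Assumed exception type; the diagram does not specify one.
        raise ValueError("File is not a YAML file: {}".format(abs_path))
    # Imported here so the path checks above can run without PyYAML installed.
    import yaml
    with open(abs_path, "r") as handle:
        return yaml.safe_load(handle)
```

Checking the path before the extension follows the order of the decisions in the diagram, so a missing file is reported as missing even when its name has the wrong suffix.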

View file

@ -0,0 +1,9 @@
dir=$(cd -P -- "$(dirname -- "$0")" && pwd -P)
echo "$dir"
for path in "$dir"/*.uml; do
    echo "$path"
    filename=$(basename -- "$path")
    filename_noext="${filename%.*}"
    docker run --rm -i think/plantuml < "$path" > "$dir/$filename_noext.svg"
done
inkscape -C "$dir"/*.svg --export-type=pdf

BIN
docs/report/img/dof.jpg (Stored with Git LFS) Normal file

Binary file not shown.

BIN
docs/report/img/experiments/exp-3-overcastjuly-melindwr1.png (Stored with Git LFS) Normal file

Binary file not shown.

BIN
docs/report/img/experiments/exp-3-overcastjuly-melindwr2.png (Stored with Git LFS) Normal file

Binary file not shown.

BIN
docs/report/img/experiments/exp-3-sunnyaugust-melindwr1.png (Stored with Git LFS) Normal file

Binary file not shown.

BIN
docs/report/img/experiments/exp-3-sunnyaugust-melindwr2.png (Stored with Git LFS) Normal file

Binary file not shown.

BIN
docs/report/img/experiments/experiment-1.png (Stored with Git LFS) Normal file

Binary file not shown.

BIN
docs/report/img/experiments/experiment-2.png (Stored with Git LFS) Normal file

Binary file not shown.

BIN
docs/report/img/experiments/experiment-4.png (Stored with Git LFS) Normal file

Binary file not shown.

BIN
docs/report/img/gitea.png (Stored with Git LFS) Normal file

Binary file not shown.

BIN
docs/report/img/gitlab.png (Stored with Git LFS) Normal file

Binary file not shown.

BIN
docs/report/img/golden-ratio.png (Stored with Git LFS) Normal file

Binary file not shown.

BIN
docs/report/img/kanban.png (Stored with Git LFS) Normal file

Binary file not shown.

BIN
docs/report/img/milestones.png (Stored with Git LFS) Normal file

Binary file not shown.

BIN
docs/report/img/plot-20-epochs.png (Stored with Git LFS) Normal file

Binary file not shown.

BIN
docs/report/img/plot-2000-epochs.png (Stored with Git LFS) Normal file

Binary file not shown.

BIN
docs/report/img/rule-of-thirds.jpg (Stored with Git LFS) Normal file

Binary file not shown.

View file

@ -0,0 +1,4 @@
dir=$(cd -P -- "$(dirname -- "$0")" && pwd -P)
inkscape -D "$dir"/*.svg --export-type=pdf --export-latex

BIN
docs/report/img/symmetry.png (Stored with Git LFS) Normal file

Binary file not shown.

BIN
docs/report/img/terminal-loss.png (Stored with Git LFS) Normal file

Binary file not shown.

BIN
docs/report/img/trophy-camera-image.jpg (Stored with Git LFS) Normal file

Binary file not shown.

BIN
docs/report/img/trophy-camera.jpg (Stored with Git LFS) Normal file

Binary file not shown.

BIN
docs/report/img/woodpecker-pipeline.png (Stored with Git LFS) Normal file

Binary file not shown.

BIN
docs/report/img/woodpecker-pipelines.png (Stored with Git LFS) Normal file

Binary file not shown.

BIN
docs/report/img/wpp.jpg (Stored with Git LFS) Normal file

Binary file not shown.

BIN
docs/report/osp1-mmp-report-hannah-draft.pdf (Stored with Git LFS) Normal file

Binary file not shown.

BIN
docs/report/osp1-mmp-report.pdf (Stored with Git LFS) Normal file

Binary file not shown.

212
docs/report/osp1-mmp-report.tex Executable file
View file

@ -0,0 +1,212 @@
\documentclass[11pt,a4paper]{report}
% Aberystwyth MMP Project Report Template for LaTeX
%
% Authors: Neil Taylor (nst@aber.ac.uk) and Dr. Hannah Dee (hmd1@aber.ac.uk)
%
% This has been adapted from the Leeds Thesis template and the
% Group Project template for Computer Science in Aberystwyth University.
%
% All comments and suggestions welcome.
%
% Template designed to be used with pdflatex: it may need alteration to
% run with a different LaTeX engine.
%
% Note - this is offered as a starting point for your work. You are not
% required to use this template and can choose to create your own document
% without it.
% This template is suitable for students with an engineering-style project,
% which will be most students in the department. If your project is a research-oriented
% project, look at the alternative template.
% To build document on the unix command line, run four commands:
% pdflatex mmp-report
% bibtex mmp-report
% pdflatex mmp-report
% pdflatex mmp-report
% you will end up with mmp-report.pdf. Before submitting, add your user ID as a prefix,
% e.g. abc01-mmp-report.pdf
\usepackage{StylesAndReferences/mmp-report}
% the following packages are used for citations - You only need to include one.
%
% Use the cite package if you are using the numeric style (e.g. IEEEannot).
% Use the natbib package if you are using the author-date style (e.g. authordate2annot).
% Only use one of these and comment out or remove the other one.
\usepackage{cite}
%\usepackage{natbib}
% Use en-GB locale formatting for dates
\usepackage[en-GB]{datetime2}
\DTMlangsetup[en-GB]{ord=raise}
% plantuml
\usepackage{plantuml}
% load images path
\graphicspath{ {./img} }
%%%% Title and Section Colours %%%%
% Modify these values to change the colours used for title, sections and subsections.
% Each value is a range of 0-255 in RGB colourspace.
% Idea courtesy of discussion at
% https://www.overleaf.com/learn/latex/Using_colours_in_LaTeX
% and
% https://tex.stackexchange.com/questions/75667/change-colour-on-chapter-section-headings-lyx
%
% If you prefer to have black headers, then comment out the following lines
\definecolor{mmpTitle}{RGB}{10, 85, 145}
\definecolor{mmpSection}{RGB}{10,85,155}
\definecolor{mmpSubsection}{RGB}{79,129,189}
\chapterfont{\color{mmpTitle}} % sets colour of chapters
\sectionfont{\color{mmpSection}} % sets colour of sections
\subsectionfont{\color{mmpSubsection}} % sets colour subsections
\subsubsectionfont{\color{mmpSubsection}} % sets colour subsections
%%%% end of Title and Section Colours %%%%
%%%% Report Type %%%%
%% comment/uncomment depending on the type of report you want to generate
%\reporttype{Engineering}
\reporttype{Research}
%%%% end of Report Type %%%%
\begin{document}
%TC:ignore
% all of the include directives below refer to tex files
% so \include{cover} includes cover.tex - to change the content,
% edit the tex file
\raggedright
\pagenumbering{roman}
% This is the front page
\include{FrontMatter/cover}
% Set up page numbering
\pagestyle{empty}
% declarations of originality
\include{FrontMatter/declaration}
\include{FrontMatter/acknowledgements} % Acknowledgements
\include{FrontMatter/abstract} % Abstract
\pagenumbering{roman}
\pagestyle{fancy}
\fancyhead{}
\fancyfoot[C]{\thepage}
\renewcommand{\headrulewidth}{0 pt}
\renewcommand{\chaptermark}[1]{\markboth{#1}{}}
\tableofcontents
\newpage
\listoffigures % comment out this line if you don't have any figures / graphics
\newpage
\listoftables % comment out this line if you don't have any tables
\newpage
% Set up page numbering
\pagenumbering{arabic}
\setchapterheaderfooter
%TC:endignore
% Problem Analysis 15%
% Technical Work 35%
% Critical Evaluation and Insight 10%
% Dissertation Presentation Quality 10%
% Word count break down:
% Problem Analysis 25% 2727 words
% Technical Work 58.3..% 6400 words
% Critical Evaluation and Insight 16.6..% 1818 words
% include the chapters
\include{Chapters_\showreporttype/background-and-objectives}
\include{Chapters_\showreporttype/experiment-methods}
\include{Chapters_\showreporttype/software-design-implementation-testing}
\include{Chapters_\showreporttype/results-and-conclusions}
\include{Chapters_\showreporttype/evaluation}
% add any additional chapters here
%TC:ignore
\setemptyheader
\nocite{*} % include everything from the bibliography, irrespective of whether it has been referenced.
% the following line is included so that the bibliography is also shown in the table of contents. There is the possibility that this is added to the previous page for the bibliography. To address this, a newline is added so that it appears on the first page for the bibliography.
\addcontentsline{toc}{chapter}{References} % Adds References to contents page
%
% example of including a bibliography. The current style uses IEEE. If you want to change, comment out the line and uncomment the previous line. You should also modify the packages included at the top (see the notes earlier in the file) and then trash your aux files and re-run.
%\bibliographystyle{StylesAndReferences/authordate2annot}
\bibliographystyle{StylesAndReferences/IEEEannotU}
\renewcommand{\bibname}{References}
\bibliography{StylesAndReferences/references} % References file
\setemptyheader
\addcontentsline{toc}{chapter}{Appendices}
\chapter*{Appendices}
%The appendices are for additional content that is useful to support the discussion in the report. It is material that is not necessarily needed in the body of the report, but its inclusion in the appendices makes it easy to access.
%
%If you have used any 3rd party code, i.e. code that you have not written yourself such as libraries, then you must include Appendix A. In that appendix, you will provide details of the 3rd party code that you have used.
%
%For most other items, it would be better to include them in your technical submission instead of including them as an appendix. For example:
%
%\begin{itemize}
% \item If you have developed a Design Specification document as part of a plan-driven approach for the project, then it would be appropriate to include that document in the technical work. In this report, you would highlight the most interesting aspects of the design, referring your reader to the full specification for further detail.
% \item If you have taken an agile approach to developing the project, then you may be less likely to have developed a full requirements specification at the start of the project. Perhaps you used stories to keep track of the functionality and the future conversations. If it isn't relevant to include all those stories in the body of your report, you could detail those stories in a document in the technical work.
% \item If you have used manual testing, then include a document in the technical work that records the tests that have been done. In this report, you would talk about the use of those tests.
%\end{itemize}
%
%Documents included in the technical work or in the appendices are supporting evidence of the work done. Where you include documents, this report should refer to the documents. You should not be relying on detailed study of those documents in order to understand what is written in this report.
%
%Speak to your supervisor or the module coordinator if you have questions about this.
%\pagebreak
% start the appendix - sets up different numbering
\fancypagestyle{plain}{%
%\fancyhf{} % clear all header and footer fields
\fancyhead[L]{Appendix\ \thechapter}
\fancyhead[R]{\leftmark}}
\appendix
\fancyhead[L]{Appendix\ \thechapter}
\fancyhead[R]{\leftmark}
\fancyhead[C]{}
\fancyfoot[C]{\thepage}
\renewcommand{\headrulewidth}{0.4pt}
\renewcommand{\chaptermark}[1]{\markboth{#1}{}}
\fancyhead[L]{Appendix\ \thechapter}
\fancyhead[R]{\leftmark}
\fancyfoot[C]{{\thepage} of \pageref{LastPage}}
% include any appendices here
\include{Appendices/appendixA}
\include{Appendices/appendixB}
\fancypagestyle{plain}{%
\fancyhead{} %[C]{Annotated Bibliography}
\fancyfoot[C]{{\thepage} of \pageref{LastPage}} % except the center
\renewcommand{\headrulewidth}{0pt}
\renewcommand{\footrulewidth}{0pt}
}
%TC:endignore
\end{document}

View file

@ -1,5 +1,9 @@
name: autophotographer
dependencies:
- numpy
- pandas
- python=3.8
- opencv-python=4.5.5.64
- matplotlib=3.5.1
- pandas=1.4.2
- PyYAML=6.0
- scikit-image=0.19.2
- tqdm=4.64.0

View file

@ -11,6 +11,8 @@ from skimage.exposure import is_low_contrast
from tqdm import tqdm
import autophotographer.filters.brightness.brightness as brightness
import autophotographer.filters.focusdetection.focusdetection as focusdetection
from autophotographer.cnn.predict import predict
import sys
# GLOBAL VARIABLES
# accepted image formats
@ -20,22 +22,18 @@ image_formats = (".jpg", ".jpeg", ".png")
video_formats = (".mp4", ".mov", ".avi", ".flv", ".mkv")
# default options
filters = ['filesize', 'brightness', 'contrast', 'focus']
modelPath = os.path.abspath(os.path.join(os.path.dirname(__file__), "../", "output/22-03-2022/model.pth"))
brightness_thresh = 0.35
focus_thresh = 0.35
filesize_thresh = 0.35
contrast_thresh = 0.35
ignore_video = False
cnnRank = False
# FUNCTIONS
# load config file
def load_config(path=os.path.join(os.path.dirname(__file__), "./config.yml")):
"""
The load_config function loads a YAML configuration file from the specified path.
:param path: Path to the YAML config file; defaults to config.yml in this module's directory.
:return: A dictionary of the config file.
"""
abs_path = os.path.abspath(path)
# check if file exists
@ -52,25 +50,7 @@ def load_config(path=os.path.join(os.path.dirname(__file__), "./config.yml")):
raise FileNotFoundError("[ERRO] Path does not exist")
# load the correct filter function from filter name
def filter_to_function(imagefilter: str, paths: list) -> list:
"""
The filter_to_function function filters a list of image paths based on the filter type.
Args:
imagefilter (str): The type of filter to apply. Can be "brightness", "contrast", or "focus".
paths (list): A list of filepaths to images that will be filtered.
brightness_thresh (int, optional): The threshold for filtering out images with low brightness levels. Defaults to 100 if not specified by user in config file or command line arguments.
contrast_thresh (float, optional): The threshold for filtering out images with low contrast levels. Defaults to 1 if not specified by user in config file or command line arguments..
focus_thresh (float, optional): The threshold for filtering out blurry images based on the variance in blurriness across all pixels within an image's bounding box as determined using OpenCV's Laplacian method and Gaussian Blur method respectively . Defaults to 0 if not specified by user in config file or command line arguments..
:param imagefilter:str: Used to Determine which filter to apply.
:param paths:list: Used to Store the paths of all images that have been filtered.
:return: The filtered list of paths.
"""
def filter_to_function(imageFilter: str, paths: list) -> list:
if imageFilter == "brightness":
paths = filter_brightness(paths, brightness_thresh)
print("[INFO] Filtering based on brightness...")
@ -306,42 +286,16 @@ def display_images(paths, location):
plt.savefig(location)
# parse command line arguments
if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument("-i", "--input", type=os.path.abspath, required=True, nargs="+",
help="path to video or image folder")
parser.add_argument("-c", "--config", type=os.path.abspath, help="path to config file")
args = vars(parser.parse_args())
# load in config file
if args["config"] is not None:
autophotoConf = load_config(args["config"])
else:
autophotoConf = load_config()
# Load values for options
if autophotoConf["brightness_options"]["threshold"] is not None:
brightness_thresh = autophotoConf["brightness_options"]["threshold"]
if autophotoConf["focus_options"]["threshold"] is not None:
focus_thresh = autophotoConf["focus_options"]["threshold"]
if autophotoConf["contrast_options"]["threshold"] is not None:
contrast_thresh = autophotoConf["contrast_options"]["threshold"]
if autophotoConf["filesize_options"]["threshold"] is not None:
filesize_thresh = autophotoConf["filesize_options"]["threshold"]
if autophotoConf["ignore_video"] is not None:
ignore_video = autophotoConf["ignore_video"]
paths = filter_paths(args["input"])
def main(paths, filters=None):
print("[INFO] Loaded {} objects.".format(len(paths)))
prior_paths = []
path_diff = []
# Order and selection of operations from config file
if autophotoConf["filters"] is not None:
if filters is not None:
# iterate over all chosen filters
for imageFilter in autophotoConf["filters"]:
for imageFilter in filters:
prior_paths = paths
# run given filter
@ -359,5 +313,41 @@ if __name__ == "__main__":
print("[INFO] Filtered {}/{} images via {} filtering.".format(
len(paths), len(prior_paths), imageFilter))
if autophotoConf["CNNrank"]:
if cnnRank:
print("[INFO] Running CNN ranking...")
predictions = predict(paths, modelPath)
return predictions
return paths
if __name__ == "__main__":
parser = argparse.ArgumentParser()
parser.add_argument("-i", "--input", type=os.path.abspath, required=True, nargs="+",
help="path to video or image folder")
parser.add_argument("-c", "--config", type=os.path.abspath, help="path to config file")
args = vars(parser.parse_args())
# load in config file
if args["config"] is not None:
autophotoConf = load_config(args["config"])
else:
autophotoConf = load_config()
# Load values for options
if autophotoConf["filters"] is not None:
filters = autophotoConf["filters"]
if autophotoConf["brightness_options"]["threshold"] is not None:
brightness_thresh = autophotoConf["brightness_options"]["threshold"]
if autophotoConf["focus_options"]["threshold"] is not None:
focus_thresh = autophotoConf["focus_options"]["threshold"]
if autophotoConf["contrast_options"]["threshold"] is not None:
contrast_thresh = autophotoConf["contrast_options"]["threshold"]
if autophotoConf["filesize_options"]["threshold"] is not None:
filesize_thresh = autophotoConf["filesize_options"]["threshold"]
if autophotoConf["ignore_video"] is not None:
ignore_video = autophotoConf["ignore_video"]
if autophotoConf["CNNrank"] is not None:
cnnRank = autophotoConf["CNNrank"]
paths = filter_paths(args["input"])
sys.exit(main(paths, filters))
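
The option-loading block above indexes the config dictionary directly, so a config file that omits a section such as `focus_options` would raise `KeyError` before the built-in default is used. One possible alternative is sketched below, assuming the same section and key names as the diff; the helper name `option_threshold` is hypothetical and not part of the project.

```python
DEFAULT_THRESHOLD = 0.35  # same default value as the module-level thresholds


def option_threshold(conf, section, default=DEFAULT_THRESHOLD):
    """Hypothetical helper: read conf[section]["threshold"] with a fallback."""
    # A missing section, or a section set to null in YAML, becomes an empty dict.
    options = conf.get(section) or {}
    value = options.get("threshold")
    # Only fall back when the value is absent; 0 is a legitimate threshold.
    return default if value is None else value


# Usage with an illustrative, partially filled config dictionary.
conf = {"focus_options": {"threshold": 0.5}, "contrast_options": None}
focus_thresh = option_threshold(conf, "focus_options")
contrast_thresh = option_threshold(conf, "contrast_options")
```

Comparing against `None` instead of relying on truthiness keeps an explicit threshold of `0` from being silently replaced by the default.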

View file

@ -1,58 +0,0 @@
import argparse
import os
import pathlib
import sys
import cv2
# Process arguments
def parse_arguments(argv=None):
parser = argparse.ArgumentParser()
parser.add_argument('-i', '--input', dest='inputfile', type=pathlib.Path, help='Specify a video file')
parser.add_argument('-o', '--output', dest='outputfolder', type=pathlib.Path, help='Specify a folder to save frames to')
return parser.parse_args()
# Convert video to frames
def video_to_frames():
capture = cv2.VideoCapture(str(inputfile))
success,image = capture.read()
count = 0
while success:
outputfile = outputfolder + "/frame%d.jpg" % count
print(outputfile)
cv2.imwrite(outputfile, image)
success,image = capture.read()
count +=1
# Shrink set based on filesize
def display_file_sizes():
filesizes = []
for filename in os.listdir(outputfolder):
filepath = outputfolder + "/" + filename
filesize = os.path.getsize(filepath)
print(filepath + ": " + str(filesize))
filesizes.append(filesize)
# work out average
average = sum(filesizes)/len(filesizes)
print ("Average is: " + str(average))
# delete files below average
count = 0
for filename in os.listdir(outputfolder):
filepath = outputfolder + "/" + filename
if filesizes[count] < average:
# print(filepath + ": " + str(filesizes[count]))
os.remove(filepath)
count += 1
#def remove_similar_frames():
# sad
args = parse_arguments()
inputfile = str(args.inputfile.absolute())
outputfolder = str(args.outputfolder.absolute())
# Convert video to frames
video_to_frames()
display_file_sizes()

View file

@ -1,5 +1,4 @@
import os
import torch
# https://pytorch.org/hub/pytorch_vision_resnet/
@ -10,15 +9,12 @@ IMAGE_SIZE = 224
VAL_SPLIT = 0.1
DEVICE = "cuda" if torch.cuda.is_available() else "cpu"
FEATURE_EXTRACTION_BATCH_SIZE = 256
PRED_BATCH_SIZE = 4
EPOCHS = 20
TRAIN_BATCH_SIZE = 256
PREDICTION_BATCH_SIZE = 4
EPOCHS = 1
LR = 0.001
IMAGE_SIZE = 32
PRED_BATCH_SIZE = 4
WARMUP_PLOT = os.path.join("output", "plot.png")
WARMUP_MODEL = os.path.join("output", "plot.pth")
TENSOR_IMAGES_PATH = "tensorImages.pt"
TENSOR_RATINGS_PATH = "tensorRatings.pt"

View file

@ -16,7 +16,7 @@ from sklearn.preprocessing import MinMaxScaler
from torch.utils.data import DataLoader
from torchvision import transforms
from . import config
import autophotographer.cnn.config as config
ImageFile.LOAD_TRUNCATED_IMAGES = True
@ -85,6 +85,10 @@ def get_random_image_index():
def calculate_image_rating(imageIndex):
df = pd.read_csv(filePathInfoAVA, sep=" ", header=None)
imageInfo = df.loc[df[1] == imageIndex]
if len(imageInfo.values[0]) > 15:
raise ValueError("Entry is larger than the expected 15 columns.")
elif len(imageInfo.values[0]) < 15:
raise ValueError("Entry is smaller than the expected 15 columns.")
styleIndex = 2
numOfRatings = 0
score = 0
@ -96,9 +100,11 @@ def calculate_image_rating(imageIndex):
styleIndex += 1
#print(numOfRatings)
#print(score)
if score == 0:
return 0
adjustedScore = score / numOfRatings
#print(adjustedScore)
return adjustedScore
return round(adjustedScore, 6)
def get_all_image_ratings():
imageRatings = {}
@ -119,7 +125,7 @@ def get_all_image_ratings():
df = pd.DataFrame(list(imageRatings.items()))
return df
def load_image_ratings():
def load_image_ratings(filePathRatings):
columns = ["id", "rating"]
df = pd.read_csv(filePathRatings, header=None, sep=" ", names=columns)
return df
@ -157,11 +163,10 @@ def build_dataframe(df, imgPath):
return df
#df = build_dataframe(remove_entries_for_missing_images(load_image_ratings(), imgPath), imgPath)
df = pd.read_csv(dataframePath, index_col = 0)
def create_tensor_array():
tensorArray = []
for idx, row in df.iterrows():
for index, row in df.iterrows():
rating = row['rating']
image = cv2.imread(row['path'])
image = cv2.resize(image, (32, 32))
@ -181,7 +186,7 @@ def load_in_tensors():
def get_dataloader(df, transforms, batchSize, shuffle=True):
# Create a dataloader for dataset
tensorArray = []
for idx, row in df.iterrows():
for index, row in df.iterrows():
# load each image path and process the data via transforms
transformedImg = transforms(pil_loader(row['path']))
rating = row['rating']
@ -195,7 +200,27 @@ def get_dataloader(df, transforms, batchSize, shuffle=True):
return (tensorArray, loader)
# https://github.com/python-pillow/Pillow/issues/835#issuecomment-53999355
def pil_loader(path):
with open(path, 'rb') as f:
img = Image.open(f)
return img.convert('RGB')
image = Image.open(f)
return image.convert('RGB')
def load_ratings_df(pathToRatingsFile):
columns = ["id", "rating"]
df = pd.read_csv(pathToRatingsFile, header=None, sep=" ", names=columns)
return df
def load_images_df(pathToImagesDir, image_formats):
imagePaths = []
imageIds = []
for filename in os.listdir(pathToImagesDir):
path = os.path.join(pathToImagesDir, filename)
if path.lower().endswith(image_formats):
imagePaths.append(path)
imageIds.append(int(os.path.splitext(filename)[0]))
df = pd.DataFrame({"id": imageIds, "path": imagePaths})
return df
# Read in the dataframe
df = pd.read_csv(dataframePath, index_col = 0)
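
The `calculate_image_rating` hunk above checks for a 15-column row, returns 0 when no score accumulates, and rounds the result to six decimal places, but the accumulation loop itself is not visible in the diff. The sketch below shows one way such a single-value rating could be computed. It assumes that columns 2 to 11 (zero-based) hold the number of votes for ratings 1 to 10; that layout and the function name `mean_rating` are assumptions for illustration, not taken from the source.

```python
def mean_rating(row):
    """Hypothetical helper: vote-weighted mean rating of one 15-column row."""
    if len(row) != 15:
        raise ValueError("Expected 15 columns, got {}".format(len(row)))
    # Assumed layout: row[2] .. row[11] are vote counts for ratings 1 .. 10.
    counts = row[2:12]
    total_votes = sum(counts)
    if total_votes == 0:
        # No votes recorded; avoid dividing by zero.
        return 0
    weighted = sum(rating * count
                   for rating, count in enumerate(counts, start=1))
    return round(weighted / total_votes, 6)


# Made-up row: two votes of 5 and two votes of 6.
example_row = [1, 42, 0, 0, 0, 0, 2, 2, 0, 0, 0, 0, 0, 0, 7]
example_rating = mean_rating(example_row)
```

With two votes of 5 and two votes of 6, the weighted sum is 22 over 4 votes, so `example_rating` is 5.5.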

View file

@ -3,7 +3,6 @@ import time
from os.path import abspath
import config
import dataset
import matplotlib.pyplot as plt
import torch
import torch.nn as nn
@ -12,31 +11,35 @@ from torch.utils.data import DataLoader
from torchvision import transforms
from torchvision.models import resnet50
from tqdm import tqdm
import autophotographer.cnn.dataset as dataset
import autophotographer.cnn.config as config
# projectRoot = "/src/"
script_directory = os.path.dirname(__file__)
projectRoot = abspath(os.path.join(script_directory, "../.."))
projectRoot = abspath(os.path.join(script_directory, "../../.."))
print("Project root: " + projectRoot)
INITIAL_PLOT_PATH = os.path.join(projectRoot, "src/output/plot.png")
INTIIAL_MODEL_PATH = os.path.join(projectRoot, "src/output/model.pth")
valDatasetPath = os.path.join(projectRoot, "src/autophotographer/valDataset.pt")
trainDatasetPath = os.path.join(projectRoot, "src/autophotographer/trainDataset.pt")
# Declare transformations
# Declare transformations for train set
trainTransform = transforms.Compose([
transforms.RandomResizedCrop(config.IMAGE_SIZE),
transforms.RandomHorizontalFlip(),
transforms.RandomResizedCrop(config.IMAGE_SIZE),
transforms.RandomRotation(90),
transforms.ToTensor(),
transforms.Normalize(mean=config.MEAN, std=config.STD)
])
# Declare transformations for val set
valTransform = transforms.Compose([
transforms.Resize((config.IMAGE_SIZE, config.IMAGE_SIZE)),
transforms.ToTensor(),
transforms.Normalize(mean=config.MEAN, std=config.STD)
])
# Split dataset into val and train
# Load dataset from dataset module and split dataset into val and train
valSetLen = int(len(dataset.df) * config.VAL_SPLIT)
trainSetLen = len(dataset.df) - valSetLen
trainSet = dataset.df[:trainSetLen]
@ -46,20 +49,20 @@ print("Using " + config.DEVICE + "...")
# create data loaders
print("Getting dataloaders...")
#(trainDataset, trainLoader) = dataset.get_dataloader(trainSet,
#transforms=trainTransform, batchSize=config.FEATURE_EXTRACTION_BATCH_SIZE)
(trainDataset, trainLoader) = dataset.get_dataloader(trainSet,
transforms=trainTransform, batchSize=config.TRAIN_BATCH_SIZE)
#torch.save(trainDataset, 'trainDataset.pt')
#(valDataset, valLoader) = dataset.get_dataloader(valSet,
#transforms=valTransform, batchSize=config.FEATURE_EXTRACTION_BATCH_SIZE, shuffle=False)
(valDataset, valLoader) = dataset.get_dataloader(valSet,
transforms=valTransform, batchSize=config.TRAIN_BATCH_SIZE, shuffle=False)
#torch.save(valDataset, 'valDataset.pt')
# Load datasets tensors from disk
valDataset = torch.load(valDatasetPath)
valLoader = DataLoader(valDataset, batch_size=config.FEATURE_EXTRACTION_BATCH_SIZE, shuffle=False, num_workers=os.cpu_count(),
pin_memory=True if config.DEVICE == "cuda" else False)
trainDataset = torch.load(trainDatasetPath)
trainLoader = DataLoader(trainDataset, batch_size=config.FEATURE_EXTRACTION_BATCH_SIZE, shuffle=True, num_workers=os.cpu_count(),
pin_memory=True if config.DEVICE == "cuda" else False)
#valDataset = torch.load(valDatasetPath)
#valLoader = DataLoader(valDataset, batch_size=config.FEATURE_EXTRACTION_BATCH_SIZE, shuffle=False, num_workers=os.cpu_count(),
# pin_memory=True if config.DEVICE == "cuda" else False)
#trainDataset = torch.load(trainDatasetPath)
#trainLoader = DataLoader(trainDataset, batch_size=config.FEATURE_EXTRACTION_BATCH_SIZE, shuffle=True, num_workers=os.cpu_count(),
# pin_memory=True if config.DEVICE == "cuda" else False)
# Load the resnet model
model = resnet50(pretrained=True)
@ -71,30 +74,32 @@ for parameter in model.parameters():
# Replace last layer with a FC single output layer
modelOutputFeatures = model.fc.in_features
model.fc = nn.Linear(modelOutputFeatures, 1)
# Send model to device
model = model.to(config.DEVICE)
# Initialize optimizer and loss function
optimizer = torch.optim.Adam(model.fc.parameters(), lr=config.LR)
lossFunction = nn.L1Loss() # mean absolute error
# Calculate steps required for train and validation
trainSteps = len(trainDataset) // config.TRAIN_BATCH_SIZE
valSteps = len(valDataset) // config.TRAIN_BATCH_SIZE
# Calculate steps for train and validation
trainSteps = len(trainDataset) // config.FEATURE_EXTRACTION_BATCH_SIZE
valSteps = len(valDataset) // config.FEATURE_EXTRACTION_BATCH_SIZE
# Store training data
# Store training data to print to output
dataDict = {"train_loss": [], "val_loss": []}
# Loop over epochs
print("Training...")
startTime = time.time()
for epoch in tqdm(range(config.EPOCHS)):
# put model in training mode
model.train()
totalTrainLoss = 0
totalValLoss = 0
# Loop over training data
for (i, (x, y)) in enumerate(trainLoader):
(x, y) = (x.to(config.DEVICE), y.to(config.DEVICE))
pred = model(x)
@ -110,6 +115,7 @@ for epoch in tqdm(range(config.EPOCHS)):
totalTrainLoss += loss
# Evaluate performance with validation set
with torch.no_grad():
model.eval()

View file

@ -0,0 +1,49 @@
import argparse
import datetime
import os
from os.path import abspath
import matplotlib.pyplot as plt
import torch
from torch import nn
from torchvision import transforms
from PIL import Image
from tqdm import tqdm
def predict(paths, model):
device = torch.device("cuda" if torch.cuda.is_available() else "cpu")
model = torch.load(model, map_location=torch.device('cpu'))
model = model.to(device)
model.eval()
# https://pytorch.org/hub/pytorch_vision_resnet/
configImageSize = 224
configMean = [0.485, 0.456, 0.406]
configStd = [0.229, 0.224, 0.225]
# Declare transforms
imageTransforms = transforms.Compose([
transforms.Resize((configImageSize, configImageSize)),
transforms.ToTensor(),
transforms.Normalize(mean=configMean, std=configStd)
])
# # Calculate the inverse std and inverse mean for denormalisation
# invMean = [-m/s for (m, s) in zip(configMean, configStd)]
# invStd = [1/s for s in configStd]
# # define denormalisation transform
# deNormalize = transforms.Normalize(mean=invMean, std=invStd)
predictions = {}
for path in tqdm(paths, desc="Predicting aesthetic quality"):
img = Image.open(path)
img = img.convert("RGB")
trImg = imageTransforms(img)
trImg = trImg.unsqueeze(0)
trImg = trImg.to(device)
with torch.no_grad():
# Make a prediction on the images
imagePrediction = model(trImg)
# Move the result to the CPU first; .numpy() fails on CUDA tensors
imagePrediction = imagePrediction[0].cpu().numpy()[0]
predictions[path] = imagePrediction
return predictions
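The commented-out denormalisation in predict() works because normalising a second time with mean -m/s and std 1/s undoes the first normalisation. A minimal sketch of that arithmetic in plain Python, assuming the same per-channel (x - mean) / std formula that torchvision's Normalize applies; the normalize helper and the sample pixel are illustrative only:

```python
# ImageNet statistics, as used in predict()
MEAN = [0.485, 0.456, 0.406]
STD = [0.229, 0.224, 0.225]


def normalize(pixel, mean, std):
    """Apply (x - mean) / std to each channel of one RGB pixel."""
    return [(v - m) / s for v, m, s in zip(pixel, mean, std)]


# Inverse parameters: normalising with these reverses the first normalisation
INV_MEAN = [-m / s for m, s in zip(MEAN, STD)]
INV_STD = [1 / s for s in STD]

pixel = [0.2, 0.5, 0.9]  # hypothetical RGB values in [0, 1]
normalized = normalize(pixel, MEAN, STD)
restored = normalize(normalized, INV_MEAN, INV_STD)
```

Up to floating-point rounding, restored equals pixel, since ((x - m)/s + m/s) * s = x.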

View file

@ -12,9 +12,9 @@ from cnn import config, dataset
# set project root for fetching files using relative file paths
script_directory = os.path.dirname(__file__)
projectRoot = abspath(os.path.join(script_directory, "../"))
projectRoot = abspath(os.path.join(script_directory, "../../"))
print(projectRoot)
PLOT_PATH = os.path.join(projectRoot, "src/output/")
plotPath = os.path.join(projectRoot, "src/output/")
# parse arguments for image to predict and model to use
parser = argparse.ArgumentParser()
@ -48,7 +48,7 @@ valSetLen = int(len(dataset.df) * config.VAL_SPLIT)
trainSetLen = len(dataset.df) - valSetLen
valSet = dataset.df[trainSetLen:]
(valDataset, valLoader) = dataset.get_dataloader(valSet,
transforms=valTransform, batchSize=config.PRED_BATCH_SIZE,
transforms=valTransform, batchSize=config.PREDICTION_BATCH_SIZE,
shuffle=True)
if torch.cuda.is_available():
@ -79,9 +79,9 @@ with torch.no_grad():
preds = model(images)
# Loop over each element in the batch
for i in range(0, config.PRED_BATCH_SIZE):
for i in range(0, config.PREDICTION_BATCH_SIZE):
# Initialize a subplot
ax = plt.subplot(config.PRED_BATCH_SIZE, 1, i + 1)
ax = plt.subplot(config.PREDICTION_BATCH_SIZE, 1, i + 1)
# De-normalize image, scale the pixel range to 255
image = images[i]
@ -109,5 +109,5 @@ with torch.no_grad():
plt.tight_layout()
date = datetime.datetime.now()
dateString = str(date.year) + "-" + str(date.month) + "-" + str(date.day) + "_" + str(date.hour) + "-" + str(date.minute) + "-" + str(date.second)
PLOT_PATH = os.path.join(PLOT_PATH, "predict-plot-" + dateString + ".png")
plt.savefig(PLOT_PATH)
plotPath = os.path.join(plotPath, "predict-plot-" + dateString + ".png")
plt.savefig(plotPath)

BIN
src/experiments/data/artificial-blurry.png (Stored with Git LFS) Normal file

Binary file not shown.

BIN
src/experiments/data/artificial-high-exposure.png (Stored with Git LFS) Normal file

Binary file not shown.

BIN
src/experiments/data/artificial-low-contrast.png (Stored with Git LFS) Normal file

Binary file not shown.

BIN
src/experiments/data/artificial-normal.png (Stored with Git LFS) Normal file

Binary file not shown.

View file

@ -0,0 +1,56 @@
import pandas as pd
import os
from autophotographer.cnn.dataset import load_ratings_df
from autophotographer.cnn.dataset import load_images_df
from autophotographer import autophotographer_main
import matplotlib.pyplot as plt
import datetime
import numpy as np
imagesDir = "/datasets/AVA/images/images"
scriptDir = os.path.dirname(__file__)
ratingsPath = os.path.abspath(os.path.join(scriptDir, "../../", "data/ratings.txt"))
image_formats = (".jpg", ".jpeg", ".png")
numSamples = 50
filters = ['filesize', 'brightness', 'contrast', 'focus']
# load ava images and ratings
imagesDf = load_images_df(imagesDir, image_formats)
ratingsDf = load_ratings_df(ratingsPath)
datasetDf = imagesDf.merge(ratingsDf, on='id', how='left')
ratingDiffs = []
for run in range(1, 20):
print("RUN {}".format(run))
# get a random sample
dfSample = datasetDf.sample(numSamples)
sampleAvgRating = float(dfSample["rating"].mean())
filtered_paths = autophotographer_main.main(dfSample["path"], filters)
# collect the ratings of the images that passed the filters
filtered_ratings=[]
for path in filtered_paths:
row = (dfSample.index[dfSample["path"] == path])[0]
filtered_ratings.append(float(dfSample.at[row, "rating"]))
filteredAvgRating = float(np.mean(filtered_ratings))
ratingDiff = sampleAvgRating - filteredAvgRating
ratingDiffs.append(ratingDiff)
if filteredAvgRating > sampleAvgRating:
print("Rating improved - Sample: {}, Filtered: {}".format(sampleAvgRating, filteredAvgRating))
else:
print("Rating did not improve - Sample: {}, Filtered: {}".format(sampleAvgRating, filteredAvgRating))
x = np.array(range(1, 20))
y = np.array(ratingDiffs)
plt.plot(x,y)
plt.xlabel("run")
plt.ylabel("rating diff")
outputDir = os.path.join(os.path.dirname(__file__), "output")
fileName = datetime.datetime.now().strftime("%Y-%m-%dT%H%M%S") + "_" + "experiment-1" + ".png"
filePath = os.path.join(outputDir, fileName)
plt.savefig(filePath)
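The plotted value is the sample average minus the filtered average, so a negative "rating diff" means the filters raised the average rating. A minimal sketch of that calculation with made-up ratings; the rating_diff helper, the file names, and the kept paths are hypothetical and stand in for the AVA data and the output of autophotographer_main.main:

```python
from statistics import mean


def rating_diff(ratings, kept_paths):
    """Return sample mean minus filtered mean.

    A negative result means the kept images rate higher on average
    than the whole sample.
    """
    kept = [ratings[path] for path in kept_paths]
    return mean(ratings.values()) - mean(kept)


# Hypothetical sample: path -> rating
ratings = {"a.jpg": 4.0, "b.jpg": 6.0, "c.jpg": 5.0, "d.jpg": 7.0}
# Hypothetical filter output: only two images survive
kept_paths = ["b.jpg", "d.jpg"]

diff = rating_diff(ratings, kept_paths)
improved = diff < 0
```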

View file

@ -0,0 +1,40 @@
# compare how many images remain after applying each filter on its own
from autophotographer.cnn.dataset import load_ratings_df
from autophotographer.cnn.dataset import load_images_df
from autophotographer import autophotographer_main
import os
import matplotlib.pyplot as plt
import numpy as np
import datetime
imagesDir = "/datasets/AVA/images/images"
scriptDir = os.path.dirname(__file__)
ratingsPath = os.path.abspath(os.path.join(scriptDir, "../../", "data/ratings.txt"))
image_formats = (".jpg", ".jpeg", ".png")
numSamples = 50
imagesDf = load_images_df(imagesDir, image_formats)
ratingsDf = load_ratings_df(ratingsPath)
datasetDf = imagesDf.merge(ratingsDf, on='id', how='left')
dfSample = datasetDf.sample(numSamples, random_state=5050)
print(__file__)
filters = ['focus']
focus_reduced = len(autophotographer_main.main(dfSample["path"], filters))
filters = ['filesize']
filesize_reduced = len(autophotographer_main.main(dfSample["path"], filters))
filters = ['brightness']
brightness_reduced = len(autophotographer_main.main(dfSample["path"], filters))
filters = ['contrast']
contrast_reduced = len(autophotographer_main.main(dfSample["path"], filters))
x = np.array(["focus", "filesize", "brightness", "contrast"])
y = np.array([focus_reduced, filesize_reduced, brightness_reduced, contrast_reduced])
plt.bar(x,y)
outputDir = os.path.join(os.path.dirname(__file__), "output")
fileName = datetime.datetime.now().strftime("%Y-%m-%dT%H%M%S") + "_" + "experiment-2" + ".png"
filePath = os.path.join(outputDir, fileName)
plt.savefig(filePath)
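Timestamped output names like the one above should be built from the calendar year (%Y). The ISO 8601 week-based year (%G) can differ from it in the days around New Year. A small sketch of the difference; the timestamped_name helper is illustrative and not part of the project:

```python
import datetime


def timestamped_name(label, now):
    """Build '<timestamp>_<label>.png' using the calendar year (%Y)."""
    return now.strftime("%Y-%m-%dT%H%M%S") + "_" + label + ".png"


# 1 January 2022 falls in ISO week 52 of 2021
new_year = datetime.datetime(2022, 1, 1, 9, 30, 0)
name = timestamped_name("experiment-2", new_year)

# isocalendar() exposes the week-based year that %G would print
iso_year = new_year.isocalendar()[0]
```

Here name starts with 2022 while iso_year is 2021, so a %G-based name would sort this file into the wrong year.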

View file

@ -0,0 +1,93 @@
from autophotographer.cnn.predict import predict
import os
import pandas as pd
import cv2
import datetime
from tqdm import tqdm
import matplotlib.pyplot as plt
scriptDir = os.path.dirname(__file__)
datasetsPath = "/datasets"
modelPath = os.path.abspath(os.path.join(scriptDir, "../", "output/22-03-2022/model.pth"))
videoDirPath = os.path.abspath(os.path.join(datasetsPath, "Hannah/" ))
filenames = os.listdir(videoDirPath)
image_formats = (".jpg", ".jpeg", ".png")
# iterate over each of the "journey" directories
for directory in [
"overcastjuly/melindwr1",
"overcastjuly/melindwr2",
"sunnyaugust/camels hump",
"sunnyaugust/diggers end",
"sunnyaugust/drunken druid",
"sunnyaugust/hippety hop",
"sunnyaugust/melindwr1",
"sunnyaugust/melindwr2",
"sunnyaugust/spaghetti junction"]:
videoDirPath = os.path.abspath(os.path.join(datasetsPath, "Hannah/", directory))
# list all file names
filenames = os.listdir(videoDirPath)
paths = []
# only retrieve the images
for filename in filenames:
if filename.lower().endswith(image_formats):
path = os.path.join(videoDirPath, filename)
paths.append(path)
# make predictions on the images
predictions = predict(paths, modelPath)
df = pd.DataFrame(list(predictions.items()), columns = ["path", "prediction"])
df = df.sort_values(by=["prediction"])
# retrieve the 5 highest ranking and 5 lowest ranking predictions
fiveHighest = df.tail(5)
fiveLowest = df.head(5)
# plot the images
row = 2
column = 5
width = 250
size = width * row
outputDir = os.path.join(os.path.dirname(__file__), "output")
fileName = datetime.datetime.now().strftime("%Y-%m-%dT%H%M%S") + "_" + "experiment-3" + directory.replace("/", "-") + ".png"
filePath = os.path.join(outputDir, fileName)
fig = plt.figure(figsize=(15, 6))
# Plot the 5 highest on the first row
plotIndex = 1
for index in tqdm(fiveHighest.index):
path = fiveHighest["path"][index]
rating = fiveHighest["prediction"][index]
image = cv2.imread(path)
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# scale the height by target width / original width to keep the aspect ratio
height = int(image.shape[0] * (width / image.shape[1]))
image = cv2.resize(image, (width, height))
fig.add_subplot(row, column, plotIndex)
plt.imshow(image)
plt.axis('off')
plt.title(rating)
plt.tight_layout()
plotIndex += 1
# Plot the 5 lowest on the second row
for index in tqdm(fiveLowest.index):
path = fiveLowest["path"][index]
rating = fiveLowest["prediction"][index]
image = cv2.imread(path)
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# scale the height by target width / original width to keep the aspect ratio
height = int(image.shape[0] * (width / image.shape[1]))
image = cv2.resize(image, (width, height))
fig.add_subplot(row, column, plotIndex)
plt.imshow(image)
plt.axis('off')
plt.title(rating)
plt.tight_layout()
plotIndex += 1
fig.tight_layout()
plt.savefig(filePath)
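Resizing a thumbnail to a fixed width without distorting it means scaling the height by the ratio of the target width to the original width. OpenCV image arrays are shaped (height, width, channels), while cv2.resize expects its size argument as (width, height). A minimal sketch of the size calculation in plain Python; the scaled_size helper and the 1920x1080 frame are assumptions for illustration:

```python
def scaled_size(shape, target_width):
    """Return (width, height) for cv2.resize, preserving the aspect ratio.

    `shape` follows the OpenCV/NumPy convention (height, width, ...).
    """
    original_height, original_width = shape[0], shape[1]
    scale = target_width / original_width
    return (target_width, int(original_height * scale))


# Hypothetical 1920x1080 video frame, stored as (height, width, channels)
frame_shape = (1080, 1920, 3)
size = scaled_size(frame_shape, 250)
```

The resulting tuple can be passed straight to cv2.resize(image, size).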

View file

@ -0,0 +1,59 @@
from autophotographer.cnn.predict import predict
import os
import pandas as pd
import cv2
import datetime
from tqdm import tqdm
import matplotlib.pyplot as plt
scriptDir = os.path.dirname(__file__)
modelPath = os.path.abspath(os.path.join(scriptDir, "../", "output/22-03-2022/model.pth"))
experimentDataPath = os.path.join(scriptDir, "data/")
image_formats = (".jpg", ".jpeg", ".png")
# list all file names
filenames = os.listdir(experimentDataPath)
paths = []
# only retrieve the images
for filename in filenames:
if filename.lower().endswith(image_formats):
path = os.path.join(experimentDataPath, filename)
paths.append(path)
# make predictions on the images
predictions = predict(paths, modelPath)
df = pd.DataFrame(list(predictions.items()), columns = ["path", "prediction"])
df = df.sort_values(by=["prediction"])
# plot the images
row = 1
column = 4
width = 250
size = width * row
outputDir = os.path.join(os.path.dirname(__file__), "output")
fileName = datetime.datetime.now().strftime("%Y-%m-%dT%H%M%S") + "_" + "experiment-4" + ".png"
filePath = os.path.join(outputDir, fileName)
fig = plt.figure(figsize=(15, 6))
# Plot every image in a single row, ordered by ascending prediction
plotIndex = 1
for index in tqdm(df.index):
path = df["path"][index]
rating = df["prediction"][index]
image = cv2.imread(path)
image = cv2.cvtColor(image, cv2.COLOR_BGR2RGB)
# scale the height by target width / original width to keep the aspect ratio
height = int(image.shape[0] * (width / image.shape[1]))
image = cv2.resize(image, (width, height))
fig.add_subplot(row, column, plotIndex)
plt.imshow(image)
plt.axis('off')
plt.title(rating)
plt.tight_layout()
plotIndex += 1
fig.tight_layout()
plt.savefig(filePath)

Some files were not shown because too many files have changed in this diff