This repository has been archived on 2022-07-15. You can view files and clone it, but cannot push or open issues or pull requests.
mmp-osp1/personal-log.md

2022-02-05 23:02:52 +00:00
# WC 2021-01-31
This is the first week of the project. I researched academic papers, existing code, and datasets related to determining image aesthetics.
**Papers:**
[Photo Aesthetics Analysis via DCNN Feature Encoding](https://ieeexplore.ieee.org/document/7886320) - Predicting aesthetic performance using a bespoke CNN solution
> H.-J. Lee, K.-S. Hong, H. Kang and S. Lee, "Photo Aesthetics Analysis via DCNN Feature Encoding," in IEEE Transactions on Multimedia, vol. 19, no. 8, pp. 1921-1932, Aug. 2017, doi: 10.1109/TMM.2017.2687759.
[AVA: A large-scale database for aesthetic visual analysis](https://ieeexplore.ieee.org/document/6247954) - The making of an aesthetic visual analysis dataset
> N. Murray, L. Marchesotti and F. Perronnin, "AVA: A large-scale database for aesthetic visual analysis," 2012 IEEE Conference on Computer Vision and Pattern Recognition, 2012, pp. 2408-2415, doi: 10.1109/CVPR.2012.6247954.
**Code:**
[Image Quality Assessment](https://github.com/idealo/image-quality-assessment) - Convolutional Neural Networks to predict the aesthetic and technical quality of images.
**Datasets:**
[AADB](https://github.com/aimerykong/deepImageAestheticsAnalysis)
AVA: [imfing/ava_downloader](https://github.com/imfing/ava_downloader), [ylogx/aesthetics](https://github.com/ylogx/aesthetics/tree/master/data/ava)
## Project idea from research
Based on the research, I decided a machine learning approach would produce higher-quality outputs. However, I was slightly concerned that following a deep-learning approach would limit the interesting discussion in my report.
The idea was to create a program that can take a video, break it down into frames and use a trained CNN to predict the most aesthetic frames and return them to the user.
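
As a rough sketch of that idea, the selection step could look something like the following. It assumes the frames have already been extracted from the video, and `score_frame` is a hypothetical stand-in for the trained CNN (any callable that returns a higher number for a more aesthetic frame); neither name comes from the project itself.

```python
import heapq
from typing import Callable, Iterable, List, Tuple, TypeVar

Frame = TypeVar("Frame")


def top_k_frames(
    frames: Iterable[Frame],
    score_frame: Callable[[Frame], float],  # hypothetical: wraps the trained CNN
    k: int = 5,
) -> List[Tuple[int, float]]:
    """Return (frame_index, score) pairs for the k best-scoring frames, best first."""
    # Score every frame once, remembering its position in the video.
    scored = [(index, score_frame(frame)) for index, frame in enumerate(frames)]
    # nlargest avoids sorting the whole list when k is much smaller than the frame count.
    return heapq.nlargest(k, scored, key=lambda pair: pair[1])
```

Returning indices rather than the frames themselves keeps the function independent of how frames are stored, so the caller can map the indices back to timestamps or image files.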
## Weekly 1:1 meeting
During the meeting, I mentioned my concerns about following a deep-learning approach. Although this approach might provide quality results, it doesn't leave much room to discuss or develop interesting solutions. Instead, as Hannah put it, it mostly depends on throwing the problem at powerful hardware to get the best output, which doesn't make for an interesting project. Hannah suggested I take a hybrid approach: use deep learning only for the last step in the pipeline, and rely on conventional engineering techniques to reduce the input data before passing it to the deep-learning stage.
She mentioned 'dumb' ways in which I could reduce the set of input frames:
- Comparing file sizes and removing the small ones (a small file might indicate a single-colour or less complex image)
- Fourier frequency analysis
- Brightness and contrast analysis
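
The three suggestions above could be sketched roughly as follows. This is only an illustration under some assumptions: frames are saved as compressed image files (so file size loosely tracks complexity), greyscale frames are available as 2-D NumPy arrays, and the function names, `min_bytes` and `radius_fraction` are all hypothetical values that would need tuning on real footage.

```python
import os

import numpy as np


def drop_small_files(paths, min_bytes):
    """Keep only frame files whose size on disk is at least min_bytes."""
    return [path for path in paths if os.path.getsize(path) >= min_bytes]


def brightness_and_contrast(gray):
    """Return (mean intensity, standard deviation) of a greyscale frame.

    The standard deviation is used here as a simple RMS contrast measure.
    """
    gray = np.asarray(gray, dtype=np.float64)
    return float(gray.mean()), float(gray.std())


def high_frequency_ratio(gray, radius_fraction=0.25):
    """Fraction of spectral energy lying outside a low-frequency disc.

    Values near 0 suggest a smooth or blurry frame; values near 1 suggest
    lots of fine detail. radius_fraction is an assumed cut-off.
    """
    gray = np.asarray(gray, dtype=np.float64)
    # Remove the mean so overall brightness (the DC term) does not dominate.
    spectrum = np.fft.fftshift(np.fft.fft2(gray - gray.mean()))
    energy = np.abs(spectrum) ** 2
    total = energy.sum()
    if np.isclose(total, 0.0):
        return 0.0  # a flat, single-colour frame has no detail at all

    rows, cols = gray.shape
    y, x = np.ogrid[:rows, :cols]
    # After fftshift the zero frequency sits at (rows // 2, cols // 2).
    distance = np.hypot(y - rows // 2, x - cols // 2)
    cutoff = radius_fraction * min(rows, cols) / 2
    return float(energy[distance > cutoff].sum() / total)
```

Each check is cheap compared with a CNN forward pass, so frames that are too dark, too flat, or too blurry can be discarded before the deep-learning stage ever sees them.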