Cham's Blog Algorithm, skill and thinking

# Machine Learning Research Methodology

2020-10-03

Original post, first published on Cham’s Blog

### Idealized Algorithm for Writing a Paper

• Find problem/data
• Start writing (yes, start writing before and during research)
• Do research/solve problem
• Finish 95% draft
• Send preview to mock reviewers
• Send preview to the rival authors (virtually or literally)
• —— one month before deadline ——
• Revise using checklist.
• Submit

### What Makes a Good Research Problem?

• It is important: If you can solve it, you can make money, or save lives, or help children learn a new language, or…
• You can get real data: Doing DNA analysis of the Loch Ness Monster would be interesting, but…
• You can make incremental progress: Some problems are all-or-nothing. Such problems may be too risky for young scientists
• There is a clear metric for success: Some problems fulfill the criteria above, but it is hard to know when you are making progress on them

### Finding Research Problems

Suppose you think idea $X$ is very good, then can you extend $X$ by…

• Making it more accurate (statistically significantly more accurate)
• Making it faster (usually an order of magnitude, or no one cares)
• Making it simpler
• Explaining why it works so well
• Making it an anytime algorithm
• Applying it in a novel setting (industrial/government track)
• Removing a parameter/assumption
• Making it an online (streaming) algorithm
• Making it work for a different data type (including uncertain data)
• Making it work for distributed systems
• Making it disk-aware (if it is currently a main memory algorithm)
• Making it work on low powered devices
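
For the "statistically significantly more accurate" item above, one common way to back up the claim is a paired permutation (sign-flip) test on per-dataset scores. The following is only a minimal sketch under stated assumptions: the function name `paired_permutation_test`, its parameters, and the choice of this particular test are illustrative and do not come from the original post. It assumes both methods were evaluated on the same items, so the scores are paired.

```python
import random


def paired_permutation_test(scores_a, scores_b, n_resamples=10_000, seed=0):
    """Two-sided paired sign-flip permutation test on the mean difference.

    Returns an estimated p-value for the null hypothesis that methods
    A and B perform equally well on the paired items.
    (Hypothetical helper, written for illustration only.)
    """
    if len(scores_a) != len(scores_b):
        raise ValueError("scores must be paired (same length)")
    if not scores_a:
        raise ValueError("need at least one pair of scores")

    n = len(scores_a)
    diffs = [a - b for a, b in zip(scores_a, scores_b)]
    observed = abs(sum(diffs) / n)

    rng = random.Random(seed)  # fixed seed so the estimate is repeatable
    hits = 0
    for _ in range(n_resamples):
        # Under the null, the sign of each paired difference is arbitrary.
        flipped = [d if rng.random() < 0.5 else -d for d in diffs]
        if abs(sum(flipped) / n) >= observed:
            hits += 1

    # Add-one smoothing keeps the estimated p-value strictly positive.
    return (hits + 1) / (n_resamples + 1)
```

A small p-value (below a threshold fixed in advance, for example 0.05) suggests the accuracy gap is unlikely to be an accident of which datasets were picked. A large one suggests the improvement may not survive a careful reviewer.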

### Paper Management

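• Read good papers. In ascending order of value: papers that chase benchmark scores $\rightarrow$ papers that propose a good idea $\rightarrow$ papers that open up a new research direction $\rightarrow$ papers that advance a key problem of the field
• Before reading, first check whether the experimental setup is reasonable, whether the baselines are credible, and whether the results are reproducible; otherwise the theory cannot be trusted
• Use the introduction to understand how the authors define and describe the problem, and how much background they have in the field
• On the algorithm side, skip papers whose method is extremely complex and has no public source code; usually the authors' problem definition is redundant, or the code is withheld for confidentiality reasons
• Try to finish the abstract and the experiments section on the same day, then decide whether to read in detail or reproduce the work.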

### Other Tips

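• Don't rehash leftovers or publish obvious filler; some TL internet celebrities have nearly led the young people in this field astray
• Don't treat the thesis requirement of a unified research direction as a constraint; once you meet the graduation requirements, do more of what interests you
• Don't aim too high too soon; stay grounded and do your experiments and reading well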