laarc.io. Looking for AI work. DMs open. ML discord: discordapp.com/invite/x52Xz3…

Seattle, WA
Joined January 2009
#laarc now has a "suggest title" link on the submission page. Submit a story! laarc.io/ Thanks to the lobste.rs/ crew for suggesting this.
Shawn Presser retweeted
teaching my pet rats how to operate simple machines, such as levers and screws
Someone did this! "Are Pre-trained Convolutions Better than Pre-trained Transformers?" abs: arxiv.org/abs/2105.03322 > The difference when compared with Transformers is that we replace the multi-headed self-attention with convolutional blocks.
Are Pre-trained Convolutions Better than Pre-trained Transformers? pdf: arxiv.org/pdf/2105.03322.pdf abs: arxiv.org/abs/2105.03322 Experimental results show that convolutions can outperform Transformers in both pre-trained and non-pre-trained setups.
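Not the paper's actual code, just a minimal sketch of the idea as described: swap the multi-headed self-attention block for a convolution over the sequence dimension. The paper evaluates several convolution variants; the depthwise-separable form and the kernel size below are assumptions.

```python
import torch
import torch.nn as nn

class ConvBlock(nn.Module):
    """Stand-in for multi-headed self-attention: a depthwise-separable
    1D convolution over the sequence dimension. One of several variants
    the paper studies; the kernel size here is a hypothetical choice."""
    def __init__(self, d_model: int, kernel_size: int = 7):
        super().__init__()
        self.depthwise = nn.Conv1d(d_model, d_model, kernel_size,
                                   padding=kernel_size // 2, groups=d_model)
        self.pointwise = nn.Conv1d(d_model, d_model, kernel_size=1)

    def forward(self, x: torch.Tensor) -> torch.Tensor:
        # x: (batch, seq_len, d_model); Conv1d expects (batch, channels, seq_len)
        x = x.transpose(1, 2)
        x = self.pointwise(self.depthwise(x))
        return x.transpose(1, 2)
```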
What’s wrong with being a puzzle-solver?
Remember when you first read The Structure of Scientific Revolutions and resolved that you, at least, were not going to be a mere puzzle-solver?
Shawn Presser retweeted
Could this be a better metric than FID for generative modelling? The proportion of fake samples whose nearest neighbor in LPIPS feature space lies in the test set rather than among the other fake samples. From the paper: arxiv.org/pdf/2103.01946.pdf
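A minimal sketch of how that proportion could be computed, assuming LPIPS embeddings have already been extracted for the fake and test images. The function name is hypothetical, and treating the embeddings as flat vectors under Euclidean distance is a simplification; see the paper for the exact procedure.

```python
import numpy as np

def fraction_nn_in_test(fake_feats: np.ndarray, test_feats: np.ndarray) -> float:
    """For each fake sample, find its nearest neighbor among the test set
    and among the *other* fake samples; return the fraction of fakes whose
    nearest neighbor lies in the test set."""
    hits = 0
    for i, f in enumerate(fake_feats):
        d_test = np.linalg.norm(test_feats - f, axis=1).min()
        d_fake = np.linalg.norm(np.delete(fake_feats, i, axis=0) - f, axis=1).min()
        hits += d_test < d_fake
    return hits / len(fake_feats)
```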
Shawn Presser retweeted
Blog post: One-year monitoring of the machine learning community on Twitter delanover.com/about/blog/one…
one-hot student teacher fill me with all your knowledge senpai~
Funny thought: maybe OpenAI isn’t releasing DALL-E because it generates porn most of the time, since they forgot to filter it out of their training set, and they’re sheepishly retraining before anybody notices. (After all, NSFW filters are imperfect, so API access is a risk.)
Shawn Presser retweeted
In case someone gets bored (not a particularly nice result ...): What is the smallest fraction of the large blue square that the red and orange squares occupy together?
Darn. I waited ~7 years to get to 6969 unread Reddit replies, then managed to miss it by one. Oh well, 7777 will be in a year or so. Anyway: "nice."
Shawn Presser retweeted
JAX-ResNet is now available on PyPI and includes implementations and ImageNet checkpoints for the following models: ResNet [18, 34, 50, 101, 152] WideResNet [50, 101] ResNeXt [50, 101] ResNet-D [50] ResNeSt [50-Fast, 50, 101, 200, 269] Check it out! github.com/n2cholas/jax-resn…
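A usage sketch based on the conventions in the repo's README; `pretrained_resnet` and the exact call signature are assumptions to verify against github.com/n2cholas/jax-resn… before relying on them.

```python
import jax.numpy as jnp
from jax_resnet import pretrained_resnet  # assumed entry point; see the repo README

# Load a ResNet-50 definition plus its ImageNet checkpoint (API assumed).
ResNet50, variables = pretrained_resnet(50)
model = ResNet50()

# Forward pass on a dummy batch of 224x224 RGB images;
# mutable=False keeps batch-norm statistics frozen (inference mode).
out = model.apply(variables, jnp.ones((1, 224, 224, 3)), mutable=False)
print(out.shape)  # expected: (1, 1000) ImageNet logits
```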
How it started. How it’s going. Also, insert joke about feature creep.
This space-filling algorithm pleases me greatly. Took all day, but worth the payoff.
Found an exhaustive treatise on data analysis. Just kidding, it’s 37 pages. It’s the tiniest technical book I’ve ever seen. It was even published by O’Reilly.
Picked out the most interesting books from storage. Ended up with these: No Bullshit Guide to Math, A Retargetable C Compiler, and Practical Signal Processing.
Today I resigned from Basecamp. I never worked there to begin with, but I’m hoping no one notices so I get 6 months of salary like everybody else that left. For real though, who wouldn’t accept $75k minus tax to leave their job? You’d have to really love your job.
Idea for avoiding local minima: Fork a training run, randomly reinitialize small parts of the model (e.g. a small portion of the middle layer), train for a bit, then average with an old checkpoint. It may not sound like much, but if you fork 256 training runs, it converges.
This differs from dropout because dropout works on activations, not weights. This is "weight dropout": randomly scrambling some of the weight values, forcing the model to move its knowledge into other parts. The knowledge isn't destroyed; it's averaged.
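A toy sketch of the procedure described in this thread, on a single flat weight vector with plain NumPy. `train_fn`, the reinit fraction, and the init scale are all stand-ins; a real model would apply the mask per-layer rather than over one flattened vector.

```python
import numpy as np

def fork_scramble_average(weights, train_fn, n_forks=256, frac=0.05, seed=0):
    """Fork a checkpoint, re-initialize a small random fraction of the
    weights in each fork ("weight dropout"), train each fork briefly,
    then average the forks together with the original checkpoint."""
    rng = np.random.default_rng(seed)
    forks = []
    for _ in range(n_forks):
        w = weights.copy()
        mask = rng.random(w.shape) < frac            # pick a small random subset
        w[mask] = rng.normal(0.0, 0.02, mask.sum())  # scramble those weights
        forks.append(train_fn(w))                    # "train for a bit"
    return np.mean([weights] + forks, axis=0)        # average with old checkpoint
```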