Beam search is a decoding algorithm commonly used in seq2seq models to generate output sequences. It produces better results than greedy decoding, but it is more computationally expensive and slower. Below is an analysis of the beam-search code in the Transformer model.
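As a rough illustration of the idea (not the Tensor2Tensor code itself), here is a minimal beam-search sketch in Python. The `step_fn` callback, which returns `(token, probability)` continuations for a partial sequence, is a hypothetical stand-in for a real decoder step:

```python
import math

def beam_search(start_token, step_fn, beam_size=4, max_len=10, eos_token="<eos>"):
    # Each hypothesis is (cumulative log-probability, token sequence).
    beams = [(0.0, [start_token])]
    completed = []

    for _ in range(max_len):
        candidates = []
        for log_prob, seq in beams:
            if seq[-1] == eos_token:
                completed.append((log_prob, seq))  # hypothesis is finished
                continue
            # Expand each live hypothesis with every possible continuation.
            for token, prob in step_fn(seq):
                candidates.append((log_prob + math.log(prob), seq + [token]))
        if not candidates:
            break  # every hypothesis has emitted eos_token
        # Greedy decoding would keep only the single best candidate;
        # beam search keeps the beam_size highest-scoring ones.
        candidates.sort(key=lambda c: c[0], reverse=True)
        beams = candidates[:beam_size]

    completed.extend(beams)  # treat leftover partial hypotheses as final
    return max(completed, key=lambda c: c[0])
```

A toy usage example, with a fake one-step distribution standing in for the model:

```python
# Toy "decoder": always offers the same three continuations.
def toy_step(seq):
    return [("a", 0.5), ("b", 0.3), ("<eos>", 0.2)]

print(beam_search("<s>", toy_step, beam_size=2, max_len=5))
# -> (-3.465..., ['<s>', 'a', 'a', 'a', 'a', 'a'])
```

With `beam_size=1` this reduces to greedy decoding; larger beams explore more candidates per step, which is exactly the extra compute the note above refers to.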
The Illustrated Transformer
This post is sourced from Jay Alammar’s blog. It is a great post illustrating the Transformer NMT model, and it provides lots of good references. Enjoy.
In the previous post, we looked at Attention – a ubiquitous method in modern deep learning models. Attention is a concept that helped improve the performance of neural machine translation applications. In this post, we will look at The Transformer – a model that uses attention to boost the speed with which these models can be trained. The Transformer outperforms the Google Neural Machine Translation model in specific tasks. The biggest benefit, however, comes from how The Transformer lends itself to parallelization. It is in fact Google Cloud’s recommendation to use The Transformer as a reference model for their Cloud TPU offering. So let’s try to break the model apart and look at how it functions.
The Transformer was proposed in the paper Attention is All You Need. A TensorFlow implementation of it is available as a part of the Tensor2Tensor package. Harvard’s NLP group created a guide annotating the paper with a PyTorch implementation. In this post, we will attempt to oversimplify things a bit and introduce the concepts one by one to hopefully make it easier to understand for people without in-depth knowledge of the subject matter.
A High-Level Look
Let’s begin by looking at the model as a single black box. In a machine translation application, it would take a sentence in one language, and output its translation in another.
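To make the black-box framing concrete, a hypothetical sketch of that interface might look like this (the class and method names are illustrative, not a real library API):

```python
class TranslationModel:
    """Hypothetical black box: a source sentence goes in, a translation comes out."""

    def translate(self, sentence: str) -> str:
        # The rest of this post opens up this box: encoder, decoder,
        # and attention all live behind this one call.
        raise NotImplementedError

# Intended usage, e.g. French to English:
# model.translate("je suis étudiant")  ->  "i am a student"
```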