The MAML Few-Shot Learning Algorithm Explained, with a Code Implementation
Published: 2025-02-19
The MAML Algorithm
Model-Agnostic Meta-Learning (MAML) [7] is model-agnostic in the sense that it is compatible with any model trained by gradient descent, and it applies to a wide range of machine learning tasks, including image classification, object detection, and reinforcement learning. The goal of meta-learning is to train a model on a large number of different tasks so that it can adapt to a new task using only a small amount of training data (i.e., a few shots) and a small number of gradient-descent steps.
Method
The ultimate goal of MAML training is to find an optimal set of initial parameters that lets the model adapt quickly (fast adaptation) to new tasks. The authors argue that some internal representations transfer to other tasks more readily than others: they capture structure shared across tasks. Since a few-shot task provides only a handful of labeled samples, training on them for many iterations inevitably overfits, so the model should take only a few gradient steps. This requires initial parameters that are broadly applicable across tasks; they should encode the prior knowledge the model has learned on the base dataset.
The model can be written as a function $f_\theta$ with parameters $\theta$. When adapting to a new task $\mathcal{T}_i$, the model takes one (or several) gradient-descent steps, updating $\theta$ to $\theta_i'$:

$$\theta_i' = \theta - \alpha \nabla_\theta \mathcal{L}_{\mathcal{T}_i}(f_\theta)$$

where $\alpha$ is a hyperparameter controlling the learning rate of the adaptation step.
Across multiple different tasks, the model parameters $\theta$ are evaluated by computing the loss at the adapted parameters $\theta_i'$. Concretely, the meta-objective is to find parameters $\theta$ such that, over the task distribution, the model can adapt quickly to every task and the loss is minimized:

$$\min_\theta \sum_{\mathcal{T}_i \sim p(\mathcal{T})} \mathcal{L}_{\mathcal{T}_i}(f_{\theta_i'}) = \min_\theta \sum_{\mathcal{T}_i \sim p(\mathcal{T})} \mathcal{L}_{\mathcal{T}_i}\big(f_{\theta - \alpha \nabla_\theta \mathcal{L}_{\mathcal{T}_i}(f_\theta)}\big)$$
Using stochastic gradient descent (SGD), the model parameters $\theta$ are updated according to:

$$\theta \leftarrow \theta - \beta \nabla_\theta \sum_{\mathcal{T}_i \sim p(\mathcal{T})} \mathcal{L}_{\mathcal{T}_i}(f_{\theta_i'})$$

where $\beta$ is the meta learning rate.
Note that the parameters we ultimately optimize are $\theta$, yet the loss is computed at the adapted parameters $\theta_i'$; the training process is illustrated in the figure from the paper.
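Expanding the meta-gradient with the chain rule makes the "gradient through a gradient" explicit. Since $\theta_i'$ itself depends on $\theta$ through the inner update, for each task:

```latex
\nabla_\theta \mathcal{L}_{\mathcal{T}_i}(f_{\theta_i'})
= \big(I - \alpha \nabla_\theta^2 \mathcal{L}_{\mathcal{T}_i}(f_\theta)\big)\,
  \nabla_{\theta_i'} \mathcal{L}_{\mathcal{T}_i}(f_{\theta_i'})
```

so the meta update involves second derivatives of the task loss; dropping the Hessian term yields the cheaper first-order MAML approximation.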
Because the loss is computed at one set of parameters while a different set is optimized, training involves two nested loops. The outer loop is the meta-learning process: sample a batch of tasks from the task distribution and compute the loss over them. The inner loop is the adaptation process: for each task, take one (or several) gradient-descent steps to update the parameters to $\theta_i'$, then compute the loss at $\theta_i'$. During backpropagation, gradients must flow back through both loops to the initial parameters $\theta$ to drive the meta update.
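The two-loop procedure can be sketched without any deep-learning framework. The example below is a deliberately toy, hypothetical setup (1-D regression tasks $y = a x$ with task-specific slope $a$, a one-parameter linear model, and the first-order approximation that drops second derivatives), not the paper's architecture:

```python
import numpy as np

# Toy first-order MAML sketch. Everything here is a made-up illustration:
# each task is a 1-D regression y = a * x, and the "model" is y_hat = w * x
# with a single scalar parameter w playing the role of theta.
rng = np.random.default_rng(0)
alpha, beta = 0.1, 0.05   # inner-loop / outer-loop learning rates

def task_loss_grad(w, a, x):
    """MSE of y_hat = w*x against y = a*x, and its gradient w.r.t. w."""
    err = (w - a) * x
    return np.mean(err ** 2), np.mean(2 * err * x)

w = 0.0  # meta parameter theta
for step in range(1000):
    meta_grad = 0.0
    tasks = rng.uniform(0.5, 2.5, size=4)       # outer loop: sample a batch of tasks
    for a in tasks:
        x_spt = rng.uniform(-1, 1, size=5)      # few-shot support set
        x_qry = rng.uniform(-1, 1, size=15)     # query set
        _, g = task_loss_grad(w, a, x_spt)
        w_fast = w - alpha * g                  # inner loop: theta' = theta - alpha * grad
        _, g_qry = task_loss_grad(w_fast, a, x_qry)
        meta_grad += g_qry                      # first-order approx. of the meta gradient
    w -= beta * meta_grad / len(tasks)          # meta update of theta

# After meta-training, w sits near the middle of the task family, so a
# single inner step adapts well to any individual task.
```

The learned initialization is not optimal for any single task; it is the point from which one gradient step reaches every task best, which is exactly the fast-adaptation objective.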
The complete MAML algorithm is shown in the figure from the paper.
Experimental Results
On the Omniglot and miniImageNet datasets, the results reported in the paper are shown in the figure.
PaddlePaddle Implementation
This section presents the key code from my entry in the PaddlePaddle Paper Reproduction Challenge (3rd edition). The complete project code is open-sourced on GitHub and AI Studio; stars and forks are welcome. Links:
GitHub link:
AI Studio link:
Key Code
This model is unusual in that gradients must propagate through both the inner and outer loops back to the initial parameters. With a conventional nn.Layer-based model, the inner-loop update would overwrite the model parameters, losing the initial ones. Exploiting the flexibility of PaddlePaddle's dynamic-graph mode, this project separates the model parameters from the forward computation: the outer loop keeps a copy of the initial parameters $\theta$, the inner loop computes the loss with updated copies of them, and the dynamic graph builds the computation graph automatically so that gradients ultimately flow back to $\theta$.
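As a minimal, framework-free illustration of this functional style (hypothetical, with analytic gradients standing in for autograd), the forward pass below takes its parameters explicitly, so an inner-loop step can produce fast weights without touching the originals:

```python
import numpy as np

# Hypothetical functional-style model: forward() takes its parameters as an
# explicit argument instead of reading them from module attributes.
def forward(x, params):
    w, b = params
    return x @ w + b

theta = [np.ones((2, 1)), np.zeros(1)]   # initial (meta) parameters
x = np.array([[1.0, 2.0]])
y = np.array([[0.0]])

# One inner-loop step computed on a *copy* of theta (analytic MSE gradients
# stand in for autograd here); theta itself is never overwritten.
err = forward(x, theta) - y
grads = [2 * x.T @ err, 2 * err.sum(axis=0)]
fast_weights = [p - 0.1 * g for p, g in zip(theta, grads)]

loss_before = float(((forward(x, theta) - y) ** 2).mean())
loss_after = float(((forward(x, fast_weights) - y) ** 2).mean())
```

Because `theta` survives the inner step intact, the outer loop can still differentiate through `fast_weights` back to it; this is the pattern the `MAML` class below implements with Paddle tensors.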
The code of the MAML class is as follows:
import numpy as np
import paddle
import paddle.nn as nn
import paddle.nn.functional as F

class MAML(paddle.nn.Layer):
    def __init__(self, n_way):
        super(MAML, self).__init__()
        # All trainable parameters are kept in plain lists so that forward()
        # can run with an arbitrary set of (fast) weights.
        self.vars = []
        self.vars_bn = []
        in_channels = 1
        # ---------------- four conv2d + BatchNorm blocks ----------------
        for _ in range(4):
            # conv2d weight and bias
            weight = paddle.static.create_parameter(
                shape=[64, in_channels, 3, 3], dtype='float32',
                default_initializer=nn.initializer.KaimingNormal(),
                is_bias=False)
            bias = paddle.static.create_parameter(
                shape=[64], dtype='float32', is_bias=True)  # initialized to zero
            self.vars.extend([weight, bias])
            # BatchNorm scale and shift
            weight = paddle.static.create_parameter(
                shape=[64], dtype='float32',
                default_initializer=nn.initializer.Constant(value=1),
                is_bias=False)
            bias = paddle.static.create_parameter(
                shape=[64], dtype='float32', is_bias=True)  # initialized to zero
            self.vars.extend([weight, bias])
            # BatchNorm running statistics are not meta-learned
            running_mean = paddle.to_tensor(np.zeros([64], np.float32), stop_gradient=True)
            running_var = paddle.to_tensor(np.ones([64], np.float32), stop_gradient=True)
            self.vars_bn.extend([running_mean, running_var])
            in_channels = 64
        # --------------------- fully connected layer --------------------
        weight = paddle.static.create_parameter(
            shape=[64, n_way], dtype='float32',
            default_initializer=nn.initializer.XavierNormal(),
            is_bias=False)
        bias = paddle.static.create_parameter(
            shape=[n_way], dtype='float32', is_bias=True)
        self.vars.extend([weight, bias])

    def forward(self, x, params=None, bn_training=True):
        if params is None:
            params = self.vars
        for i in range(4):
            weight, bias = params[4 * i], params[4 * i + 1]      # conv layer i
            x = F.conv2d(x, weight, bias, stride=1, padding=1)
            weight, bias = params[4 * i + 2], params[4 * i + 3]  # BN layer i
            running_mean, running_var = self.vars_bn[2 * i], self.vars_bn[2 * i + 1]
            x = F.batch_norm(x, running_mean, running_var,
                             weight=weight, bias=bias, training=bn_training)
            x = F.relu(x)
            x = F.max_pool2d(x, kernel_size=2)
        x = paddle.reshape(x, [x.shape[0], -1])  # flatten
        weight, bias = params[-2], params[-1]    # linear layer
        return F.linear(x, weight, bias)

    def parameters(self, include_sublayers=True):
        return self.vars
The code of the meta-learner class is as follows:
from copy import deepcopy

class MetaLearner(nn.Layer):
    def __init__(self, n_way, glob_update_step, glob_update_step_test, glob_meta_lr, glob_base_lr):
        super(MetaLearner, self).__init__()
        self.update_step = glob_update_step            # task-level inner update steps
        self.update_step_test = glob_update_step_test
        self.net = MAML(n_way=n_way)
        self.meta_lr = glob_meta_lr                    # outer-loop learning rate
        self.base_lr = glob_base_lr                    # inner-loop learning rate
        self.meta_optim = paddle.optimizer.Adam(learning_rate=self.meta_lr,
                                                parameters=self.net.parameters())

    def forward(self, x_spt, y_spt, x_qry, y_qry):
        task_num = x_spt.shape[0]
        query_size = x_qry.shape[1]  # 75 = 15 * 5
        loss_list_qry = [0 for _ in range(self.update_step + 1)]
        correct_list = [0 for _ in range(self.update_step + 1)]

        # Inner-loop gradients are applied manually; the outer loop uses the
        # optimizer defined above.
        for i in range(task_num):
            # step-0 update
            y_hat = self.net(x_spt[i], params=None, bn_training=True)  # (setsz, ways)
            loss = F.cross_entropy(y_hat, y_spt[i])
            grad = paddle.grad(loss, self.net.parameters())  # gradients w.r.t. all parameters
            tuples = zip(grad, self.net.parameters())        # pair each gradient with its parameter
            # fast_weights is theta' = theta - alpha * nabla(L)
            fast_weights = list(map(lambda p: p[1] - self.base_lr * p[0], tuples))
            # Evaluate on the query set before any update: losses go to
            # loss_list_qry[0], correct counts to correct_list[0].
            with paddle.no_grad():
                y_hat = self.net(x_qry[i], self.net.parameters(), bn_training=True)
                loss_qry = F.cross_entropy(y_hat, y_qry[i])
                loss_list_qry[0] += loss_qry
                pred_qry = F.softmax(y_hat, axis=1).argmax(axis=1)  # size = (75); axis=-1 also works
                correct = paddle.equal(pred_qry, y_qry[i]).numpy().sum().item()
                correct_list[0] += correct
            # Evaluate on the query set after one update: losses go to
            # loss_list_qry[1], correct counts to correct_list[1].
            with paddle.no_grad():
                y_hat = self.net(x_qry[i], fast_weights, bn_training=True)
                loss_qry = F.cross_entropy(y_hat, y_qry[i])
                loss_list_qry[1] += loss_qry
                pred_qry = F.softmax(y_hat, axis=1).argmax(axis=1)  # size = (75)
                correct = paddle.equal(pred_qry, y_qry[i]).numpy().sum().item()
                correct_list[1] += correct

            # remaining update steps
            for k in range(1, self.update_step):
                y_hat = self.net(x_spt[i], params=fast_weights, bn_training=True)
                loss = F.cross_entropy(y_hat, y_spt[i])
                grad = paddle.grad(loss, fast_weights)
                tuples = zip(grad, fast_weights)
                fast_weights = list(map(lambda p: p[1] - self.base_lr * p[0], tuples))

                if k < self.update_step - 1:
                    with paddle.no_grad():
                        y_hat = self.net(x_qry[i], params=fast_weights, bn_training=True)
                        loss_qry = F.cross_entropy(y_hat, y_qry[i])
                        loss_list_qry[k + 1] += loss_qry
                else:
                    # For the last update step, keep the loss in the graph so
                    # the outer loop can backpropagate through it.
                    y_hat = self.net(x_qry[i], params=fast_weights, bn_training=True)
                    loss_qry = F.cross_entropy(y_hat, y_qry[i])
                    loss_list_qry[k + 1] += loss_qry

                with paddle.no_grad():
                    pred_qry = F.softmax(y_hat, axis=1).argmax(axis=1)
                    correct = paddle.equal(pred_qry, y_qry[i]).numpy().sum().item()
                    correct_list[k + 1] += correct

        loss_qry = loss_list_qry[-1] / task_num  # average last-step loss over tasks
        self.meta_optim.clear_grad()             # zero the gradients
        loss_qry.backward()
        self.meta_optim.step()

        accs = np.array(correct_list) / (query_size * task_num)  # accuracy at each update step
        loss = np.array(loss_list_qry) / task_num                # loss at each update step
        return accs, loss

    def finetunning(self, x_spt, y_spt, x_qry, y_qry):
        # assert len(x_spt.shape) == 4
        query_size = x_qry.shape[0]
        correct_list = [0 for _ in range(self.update_step_test + 1)]

        new_net = deepcopy(self.net)
        y_hat = new_net(x_spt)
        loss = F.cross_entropy(y_hat, y_spt)
        grad = paddle.grad(loss, new_net.parameters())
        fast_weights = list(map(lambda p: p[1] - self.base_lr * p[0],
                                zip(grad, new_net.parameters())))

        # Accuracy on the query set before any update.
        with paddle.no_grad():
            y_hat = new_net(x_qry, params=new_net.parameters(), bn_training=True)
            pred_qry = F.softmax(y_hat, axis=1).argmax(axis=1)  # size = (75)
            correct = paddle.equal(pred_qry, y_qry).numpy().sum().item()
            correct_list[0] += correct

        # Accuracy on the query set after one update.
        with paddle.no_grad():
            y_hat = new_net(x_qry, params=fast_weights, bn_training=True)
            pred_qry = F.softmax(y_hat, axis=1).argmax(axis=1)  # size = (75)
            correct = paddle.equal(pred_qry, y_qry).numpy().sum().item()
            correct_list[1] += correct

        for k in range(1, self.update_step_test):
            y_hat = new_net(x_spt, params=fast_weights, bn_training=True)
            loss = F.cross_entropy(y_hat, y_spt)
            grad = paddle.grad(loss, fast_weights)
            fast_weights = list(map(lambda p: p[1] - self.base_lr * p[0],
                                    zip(grad, fast_weights)))

            y_hat = new_net(x_qry, fast_weights, bn_training=True)

            with paddle.no_grad():
                pred_qry = F.softmax(y_hat, axis=1).argmax(axis=1)
                correct = paddle.equal(pred_qry, y_qry).numpy().sum().item()
                correct_list[k + 1] += correct

        del new_net
        accs = np.array(correct_list) / query_size
        return accs
Reproduction Results
This project reproduced the experiments on the Omniglot dataset; the results are shown in the table below:
Summary
This article briefly introduced the background, concepts, and common datasets of few-shot learning, then focused on the method, experimental results, and key code of the MAML meta-learning model. MAML is a natural entry point into few-shot learning and a standard baseline for evaluating new algorithms. Understanding this classic model lays a solid foundation for future research or practical applications. PaddleFSL, PaddlePaddle's official few-shot learning library, includes few-shot solutions for computer vision and natural language processing problems, such as MAML, ProtoNet, and Relation Net. It is the first few-shot learning library built on PaddlePaddle; attention and contributions are welcome.
References
[1] Vinyals O, Blundell C, Lillicrap T, et al. Matching networks for one shot learning. Advances in Neural Information Processing Systems, 2016.
[2] Ravi S, Larochelle H. Optimization as a model for few-shot learning. 2016.
[3] Ren M, Triantafillou E, Ravi S, et al. Meta-learning for semi-supervised few-shot classification. arXiv preprint arXiv:1803.00676, 2018.
[4] Oreshkin B N, Rodriguez P, Lacoste A. TADAM: Task dependent adaptive metric for improved few-shot learning. arXiv preprint arXiv:1805.10123, 2018.
[5] Lake B, Salakhutdinov R, Gross J, et al. One shot learning of simple visual concepts. Proceedings of the Annual Meeting of the Cognitive Science Society, 2011.
[6] Wah C, Branson S, Welinder P, et al. The Caltech-UCSD Birds-200-2011 dataset. 2011.
[7] Finn C, Abbeel P, Levine S. Model-agnostic meta-learning for fast adaptation of deep networks. International Conference on Machine Learning, 2017: 1126-1135.