(Supplement) A script that scrapes The Atlantic and calls the Caiyun Xiaoyi translation API
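Introduction
The previous post showed how to scrape news articles from The Atlantic for English study. This post supplements it: while scraping an article's paragraphs, the script also calls a translation API, producing the result shown in the figure.
As the figure shows, the translation quality is very good. It relies on Caiyun Xiaoyi (彩云小译), a translator most programmers know. The rest of this post focuses on capturing the request and using the Caiyun Xiaoyi API from third-party code.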
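Questions and related posts
- Reviewing the use of the json library
- Converting between strings and JSON
- How a Python crawler sends a POST request in request-payload form
- Scrapy crawler notes (1): the formdata parameter of scrapy.FormRequest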
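The string/JSON conversion mentioned in the list above comes down to `json.dumps` and `json.loads`. The following is a minimal sketch; the payload dictionary is a made-up example shaped like the one used later in this post, not data captured from the real API.

```python
import json

# Hypothetical payload, shaped like the one sent to the translation API.
payload = {'source': ['Hello', 'World'], 'trans_type': 'en2zh'}

# dict -> JSON string (this is what goes into the request body)
body = json.dumps(payload)

# JSON string -> dict (this is what you do with the response text)
restored = json.loads(body)

# By default non-ASCII characters are escaped; ensure_ascii=False keeps them readable.
escaped = json.dumps({'target': '你好'})
readable = json.dumps({'target': '你好'}, ensure_ascii=False)
```

`ensure_ascii=False` is mainly useful when you print or save translated Chinese text; for the request body either form should be fine, since both are valid JSON.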
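Writing to a file
Here the output is written straight to Markdown, with a translate() function added to do the translation. For everything else, see the previous post.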
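    def to_MarkDown(header, meta, time, p_list):
        # The encoding was misspelled as 'utf=8' in the original; it must be 'utf-8'.
        with open('./《Atlantic》__{}.md'.format(header[0].strip()), 'w+', encoding='utf-8') as f:
            f.writelines('## {}'.format(header[0].strip()) + '\n')
            # "爬取自" = "scraped from"
            f.writelines('**{}**'.format(time[0].strip()) + '爬取自《The Atlantic》\n\n')
            # "導讀" = "introduction"
            f.writelines('> 導讀:**{}**'.format(meta[0].strip()) + '\n\n')
            f.write('\n ')
            # for p in p_list:
            #     f.write('\n\n '.join(p))
            #     f.write('\n\n ')
            # Flatten the nested paragraph lists into one list of strings.
            source = []
            for p in p_list:
                for i in p:
                    source.append(i)
            p_trans = translate(source)
            # Write each English paragraph followed by its translation as a quote.
            for i, j in zip(source, p_trans):
                f.write(' {}\n'.format(i))
                f.write('> {}\n\n'.format(j.strip()))
        # "寫入成功" = "written successfully"
        print('./《Atlantic》__{}.md | 寫入成功'.format(header[0].strip()))

Adding the Caiyun Xiaoyi translation API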
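Open the developer tools (F12) and capture the traffic. It is simple: once you try it, you will find there is just one translator endpoint, as shown in the figure.
From the capture we can see that it is a POST request, and we get its url.
Next, look at how the POST request parameters are put together; the relevant parts are boxed in red in the figure.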
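- X-Authorization is required. You need to apply for a token; the free quota is 1 million characters per month, which is plenty.
- content-type: note that this differs from the data parameter of a traditional POST request. Most sites now pass parameters as a request payload, and the thing you need to change is this content-type.
- Payload: see the code for how the parameters are put together, or analyze the request yourself.
- Note that trans_type supports more than English-to-Chinese (en2zh) and Chinese-to-English (zh2en); Japanese is supported as well.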
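As for X-Authorization, the site itself uses a shared public token with essentially no request limit; you can spot it by inspecting the request.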
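To make the content-type point concrete, here is a small sketch of the difference between a classic form body and a JSON request payload. It uses only the standard library and builds the request without sending it. The helper names `build_form_body` and `build_json_request` are hypothetical, and `YOUR_TOKEN` is a placeholder; the URL and field names are the ones used in this post.

```python
import json
from urllib.parse import urlencode
from urllib.request import Request

API_URL = 'https://api.interpreter.caiyunai.com/v1/translator'


def build_form_body(params):
    """Classic form POST body: key=value pairs joined by '&'."""
    return urlencode(params)


def build_json_request(source, token, trans_type='en2zh'):
    """Build (but do not send) a POST request carrying a JSON payload."""
    payload = {
        'source': source,
        'media': 'text',
        'detect': 'true',
        'trans_type': trans_type,
        'request_id': 'demo',
    }
    headers = {
        'X-Authorization': 'token ' + token,
        'content-type': 'application/json',  # tells the server the body is JSON
    }
    # The body is the JSON string encoded to bytes, not key=value pairs.
    data = json.dumps(payload).encode('utf-8')
    return Request(API_URL, data=data, headers=headers)


form_body = build_form_body({'trans_type': 'en2zh', 'request_id': 'demo'})
req = build_json_request(['Hello'], 'YOUR_TOKEN')
```

Printing `form_body` and `req.data` side by side shows the two shapes: the first is `key=value&key=value`, the second is a JSON document. A server that expects a request payload will generally not parse the first form, which is why the content-type header and `json.dumps` go together.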
    import json
    import time

    import requests  # used below, but the import was missing from the original snippet


    def translate(source):
        # source is a list of English paragraphs; returns the list of translations.
        payload = {
            'source': None,
            'media': 'text',
            'detect': 'true',
            'trans_type': 'en2zh',
            'request_id': 'demo',
        }
        payload['source'] = source
        payload = json.dumps(payload)  # the endpoint expects a JSON request payload
        headers = {
            'X-Authorization': 'token YOUR_TOKEN',  # replace YOUR_TOKEN with your own token
            'User-Agent': 'Mozilla/5.0 (Windows NT 10.0; WOW64) AppleWebKit/537.36 (KHTML, like Gecko) Chrome/58.0.3029.110 Safari/537.36 SE 2.X MetaSr 1.0',
            'content-type': 'application/json',
            'Referer': 'http://fanyi.caiyunapp.com/',
        }
        url = 'https://api.interpreter.caiyunai.com/v1/translator'
        response = requests.post(url, data=payload, headers=headers)
        time.sleep(5)  # pause between requests
        if response.status_code == 200:
            return json.loads(response.text)['target']

Debugging the request step by step:

    >>> response.status_code
    200
    >>> translate_p = json.loads(response.text)['target']
    >>> len(translate_p)
    18
    >>> p_s = []
    >>> for i in p:
    ...     for j in i:
    ...         p_s.append(j)
    ...
    >>> for i, j in zip(p_s, translate_p):
    ...     print(' {}\n'.format(i))
    ...     print('> {}\n\n'.format(j.strip()))
    ...

Last year, the world learned that researchers led by David Evans from the University of Alberta had resurrected a virus called horsepox. The virus hasn’t been seen in nature for decades, but Evans’s team assembled it using genetic material that they ordered from a company that synthesizes DNA.
> 去年,全世界都知道,由大衛 · 埃文斯領導的阿爾伯塔大學研究人員復活了一種叫水痘的病毒。 這種病毒在自然界已經有幾十年沒有出現過了,但是埃文斯的研究小組用他們從合成 DNA 的公司訂購的基因材料進行組裝。

The work caused a huge stir. Horsepox is harmless to people, but its close cousin, smallpox, killed hundreds of millions before being eradicated in 1980. Only two stocks of smallpox remain, one held by Russia and the other by the U.S. But Evans’s critics argued that his work makes it easier for others to recreate smallpox themselves, and, whether through accident or malice, release it. That would be horrific: Few people today are immunized against smallpox, and vaccine reserves are limited.
Several concerned parties wrote letters urging scientific journals not to publish the paper that described the work, but PLOS One did so in January.
> 這項工作引起了巨大的轟動。 馬瘟對人體無害,但它的近親天花在1980年被根除之前已經殺死了上億人。 只有兩種天花存在,一種由俄羅斯持有,另一種由美國持有。 但是埃文斯的批評者認為,他的工作使得其他人更容易自己重建天花,并且,無論是意外還是惡意,都會釋放出來。 這將是可怕的: 今天很少有人接種天花疫苗,而且疫苗儲備有限。 一些有關方面寫信敦促科學期刊不要發表描述這項研究的論文,但是《公共科學圖書館 · 綜合》一月份就這樣做了。

This controversy is the latest chapter in an ongoing debate around “dual-use research of concern”—research that could clearly be applied for both good and ill. More than that, it reflects a vulnerability at the heart of modern science, where small groups of researchers and reviewers can make virtually unilateral decisions about experiments that have potentially global consequences, and that everyone else only learns about after the fact. Cue an endlessly looping GIF of Jurassic Park’s Ian Malcolm saying, “Your scientists were so preoccupied with whether or not they could, they didn’t stop to think if they should.”
> 這一爭議是圍繞"雙重用途的關注性研究"正在進行的辯論中的最新一章,這項研究顯然可以用于善與弊。 更重要的是,它反映了現代科學核心的脆弱性,在這里,一小群研究人員和評論家可以對具有潛在全球影響的實驗作出幾乎是單方面的決定,而其他人只能在事后才知道。 《侏羅紀公園》的伊恩 · 馬爾科姆無休止地循環著,他說:"你們的科學家們一心想著他們是否能做到,他們沒有停下來思考是否應該這樣做。"

Except Evans did think about whether he should, and clearly came down on yes. In one of several new opinion pieces that reflect on the controversy, he and his colleague Ryan Noyce argue that recreating horsepox has two benefits. First, Tonix, the company that funded the research, hopes to use horsepox as the basis of a safer smallpox vaccine, should that extinct threat ever be itself resurrected. Second, the research could help scientists to more efficiently repurpose poxviruses into vaccines against other diseases, or even weapons against cancer. (Evans politely declined a request for an interview, noting that he’d “rather let [his] piece speak for itself.”)
> 除了埃文斯確實考慮過他是否應該這樣做,而且很明顯是的。 他和他的同事瑞恩 · 諾伊斯在一篇反思這場爭論的新觀點中提出,重建馬瘟有兩個好處。 首先,為這項研究提供資金的公司 Tonix 希望使用水痘作為一個更安全的人痘接種術的基礎,如果已經滅絕的威脅自身復活的話。 其次,這項研究可以幫助科學家更有效地將痘病毒重新用于其他疾病的疫苗中,甚至可以用來對抗癌癥。 (埃文斯禮貌地拒絕了采訪的要求,指出他"寧愿讓他的文章自己說話。")

Tom Inglesby, a health-security expert at the Johns Hopkins Bloomberg School of Public Health, doesn’t buy it. He says these purported benefits are hypothetical, and could be achieved in safer ways that don’t involve horsepox at all. Even if you want to use that particular virus, the CDC has specimens in its freezers; Evans didn’t ask for those because he thought Tonix couldn’t have commercialized the naturally occurring strain into a vaccine, according to reporting from NPR’s Nell Greenfieldboyce.
> 布隆博格公共衛生學院的健康安全專家湯姆·英格斯比(Tom Inglesby)不買賬。 他說,這些所謂的好處是假設的,可以通過更安全的方式來實現,而且根本不涉及水痘。 根據美國國家公共電臺的 Nell Greenfieldboyce 的報道,即使你想使用這種特殊的病毒,疾病控制中心的冰箱里也有標本; 埃文斯并沒有要求這些樣本,因為他認為 Tonix 無法將自然產生的毒株轉化為疫苗。

“I was a little surprised that the issue caused so much controversy,” says Gigi Gronvall, who has written extensively on biosecurity and also works at Johns Hopkins. Other researchers had already synthesized smaller viruses like polio, and bigger entities like bacteria; they’ve even made a start on far larger organisms like yeast. Given such milestones, one should just assume that all viruses are within reach—but only to those with the right expertise, equipment, and money. Evans didn’t just order horsepox in the mail; it took years to refine the process of making and assembling it.
“It’s not like anybody could synthesize horsepox,” says Gronvall.
> "我對這個問題引發如此大的爭議感到有點驚訝,"Gigi Gronvall 說,她曾經寫過大量關于生物安全的文章,同時也在約翰霍普金斯大學工作。 其他研究人員已經合成了小型病毒,比如脊髓灰質炎,還有更大的細菌,比如細菌; 他們甚至已經開始研究像酵母這樣的大得多的生物體。 考慮到這些里程碑式的事件,人們應該假設所有的病毒都可以接觸到——但是只有那些擁有正確的專業知識、設備和金錢的病毒。 埃文斯不僅在郵件中訂購馬瘟,還花了數年的時間來完善制作和組裝過程。 "這不像是任何人都可以合成馬瘟病毒的,"格隆維爾說。

True, says Kevin Esvelt from MIT, but that feat is now technically easier because Evans’s paper spelled out several details of how to do so. It’s conceptually easier to weaponize because his paper explicitly connected the dots to smallpox. And it will become logistically easier to carry out with time, as the underlying tech becomes cheaper. “In the long run, I’m worried about the technology being accessible enough,” Esvelt says.
> 的確如此,來自麻省理工學院的凱文 · 埃斯維特說,但是從技術上來說,這一壯舉在技術上變得更加容易,因為埃文斯的論文詳細闡述了如何做到這一點。 因為他的論文明確地將這些點與天花聯系在一起,所以在概念上更容易武器化。 隨著基礎技術變得更加廉價,隨著時間的推移,這將變得更加容易。 "從長遠來看,我擔心的是技術是否足夠容易獲得,"Esvelt 說。

There are ways of mitigating that risk. Most groups can’t make DNA themselves, and must order sequences from companies. Esvelt thinks that all such orders should be screened against a database of problematic sequences, as a bulwark against experiments that are unknowingly or deliberately dangerous. Such screening already occurs, but only on a voluntary basis. A mandatory, universal process could work if publishers or funders boycott work that doesn’t abide by it, or if companies build the next generation of DNA synthesizers to lock if a screening step is fixed.
> 有一些方法可以減輕這種風險。 大多數團隊不能自己制造 DNA,而且必須從公司訂購序列。 認為所有這些命令都應該在一個有問題的序列數據庫中進行篩選,以防止不知不覺或故意危險的實驗。 這種篩查已經發生,但只是在自愿的基礎上進行。 如果出版商或出資者抵制不遵守的工作,或者如果公司建立下一代 DNA 合成器來鎖定篩選步驟,那么一個強制性的、普遍的過程就可以奏效。

But these technological fixes do little to address the underlying debate about how society decides what kinds of experiments should be done in the first place, let alone published. Few countries have clear procedures for reviewing dual-use research. The U.S. has perhaps the strongest policy, but it still has several loopholes. It only covers 15 big, bad pathogens, and horsepox, though related to one, isn’t one itself. It also only covers federally funded research, and Evans’s research was privately funded. He did his work in Canada, but he could just as easily have done so in the U.S.
> 但是這些技術上的解決辦法并沒有解決社會如何決定應該做什么樣的實驗的潛在爭論,更不用說發表了。 很少有國家有審查雙重用途研究的明確程序。 美國也許有最強硬的政策,但它仍然有一些漏洞。 它只覆蓋了15個大的、不好的病原體和馬瘟,盡管與其中一種病原體有關,但它本身并不存在。 它也只涉及聯邦資助的研究,埃文斯的研究是私人資助的。 他在加拿大做了他的工作,但他在美國也可以輕而易舉地這樣做。

Absent clearer guidelines, the burden falls on the scientific enterprise to self-regulate—and it isn’t set up to do that well. Academia is intensely competitive, and “the drivers are about getting grants and publications, and not necessarily about being responsible citizens,” says Filippa Lentzos from Kings College London, who studies biological threats. This means that scientists often keep their work to themselves for fear of getting scooped by their peers. Their plans only become widely known once they’ve already been enacted, and the results are ready to be presented or published. This lack of transparency creates an environment where people can almost unilaterally make decisions that could affect the entire world.
> 如果沒有更明確的指導方針,科研企業將承擔起自我監管的責任,而且它并不能很好地做到這一點。 倫敦大學國王學院研究生物威脅的 Filippa Lentzos 說,學術界競爭激烈,"驅動因素是獲得獎學金和出版物,而不一定是要成為負責任的公民。"他研究生物威脅。 這意味著科學家常常把他們的工作留給自己,以免被同齡人挖走。 他們的計劃只有在已經頒布之后才會廣為人知,而且結果已經準備就緒,可以提交或公布。 這種缺乏透明度的做法創造了一種環境,人們幾乎可以單方面作出可能影響整個世界的決定。

Take the horsepox study. Evans was a member of a World Health Organization committee that oversees smallpox research, but only told his colleagues about the experiment after it was completed. He sought approval from biosafety officers at his university, and had discussions with Canadian federal agencies, but it’s unclear if they had enough ethical expertise to fully appreciate the significance of the experiment. “It’s hard not to feel like he opted for agencies that would follow the letter of the law without necessarily understanding what they were approving,” says Kelly Hills, a bioethicist at Rogue Bioethics.
> 以馬瘟研究為例。 埃文斯是世界衛生組織的一個委員會的成員,該委員會負責監督天花的研究,但是他只是在實驗完成后才告訴他的同事。 他尋求大學生物安全官員的批準,并與加拿大聯邦機構進行了討論,但目前還不清楚他們是否具備足夠的道德專業知識,以充分認識到這項實驗的重要性。 羅格生物倫理學院的生物倫理學家凱利·希爾斯(Kelly Hills)表示:"我們很難不覺得他選擇了那些遵循法律條文的機構,而不一定了解他們所批準的內容。"。

She also sees a sense of impulsive recklessness in the interviews that Evans gave earlier this year. Science reported that he did the experiment “in part to end the debate about whether recreating a poxvirus was feasible.” And he told NPR that “someone had to bite the bullet and do this.” To Hills, that sounds like: I did it because I could do it. “We don’t accept those arguments from anyone above age six,” she says.
> 她還在今年早些時候的采訪中看到了一種沖動的魯莽感。 科學報道說,他做這個實驗的部分原因是為了結束關于重建痘病毒是否可行的爭論。" 他告訴美國國家公共廣播電臺,"必須有人咬緊牙關才能做到這一點。" 對于希爾斯來說,這聽起來像是: 我這么做是因為我能做到。 "我們不接受任何六歲以上人士的觀點,"她表示。

Even people who are sympathetic to Evans’s arguments agree that it’s problematic that so few people knew about the work before it was completed. “I can’t emphasize enough that when people in the security community feel like they’ve been blindsided, they get very concerned,” says Diane DiEuliis from National Defense University, who studies dual-use research.
> 即使是那些同情埃文斯論點的人也認為,在工作完成之前,很少有人知道這項工作是有問題的。 國防大學研究雙重用途研究的黛安·迪尤利斯(Diane DiEuliis)表示:"我再怎么強調也不過分,當安全部門的人覺得自己被暗算時,他們會非常擔心。"。

The same debates played out in 2002, when other researchers synthesized poliovirus in a lab. And in 2005, when another group resurrected the flu virus behind the catastrophic 1918 pandemic. And in 2012, when two teams mutated H5N1 flu to be more transmissible in mammals, in a bid to understand how that might happen in the wild. Many of the people I spoke with expressed frustration over this ethical Möbius strip. “It’s hard not to think that we’re moving in circles,” Hills says.
“Can we stop saying we need to have a conversation and actually get to the conversation?”
> 同樣的辯論發生在2002年,當時其他研究人員在實驗室合成了脊髓灰質炎病毒。 2005年,當另一個團體在1918年災難性的大流行病背后復活了流感病毒。 而在2012年,當兩個團隊將 H5N1病毒變異為哺乳動物更容易傳播,以期了解野生動物中可能發生的情況。 與我交談過的許多人對這個倫理道德條款表示失望。 "很難不認為我們在繞圈子,"希爾斯說。 "我們能不能不要再說我們需要進行一次談話,而是真正地開始談話嗎?"

The problem is that scientists are not trained to reliably anticipate the consequences of their work. They need counsel from ethicists, medical historians, sociologists, and community representatives—but these groups are often left out from the committees that currently oversee dual-use research. “The peer group who is weighing in on these decisions is far too narrow, and these experiments have the potential to affect such a large swath of society,” says Lentzos. “I’m not saying we should flood committees with people off the streets, but there are a lot of professionals who are trained to think ethically or from a security perspective. Scientists don’t have that and it’s actually unfair that they’re being asked to make judgment calls on security issues.”
> 問題在于,科學家沒有接受過可靠預測他們工作后果的訓練。 他們需要來自倫理學家、醫學歷史學家、社會學家和社區代表的建議,但這些團體往往被排除在目前負責雙重用途研究的委員會之外。 "參與這些決策的同行群體太過狹隘,這些實驗有可能影響到如此大的社會群體,"Lentzos 說。 "我并不是說我們應該讓委員會里的人走上街頭,但是有很多專業人士接受過道德或安全方面的培訓。 科學家們沒有這種能力,而且他們被要求對安全問題做出判斷是不公平的。"

More broadly, Hills says, there’s a tendency for researchers to view ethicists and institutional reviewers as yet more red tape, or as the source of unnecessary restrictions that will stifle progress. Esvelt agrees. “Science is built to ascend the tree of knowledge and taste its fruit, and the mentality of most scientists is that knowledge is always good,” he says. “I just don’t believe that that’s true. There are some things that we are better off not knowing.” He thinks the scientific enterprise needs better norms around potentially dangerous information. First: Don’t spread it. Second: If someone tells you that your work represents an information hazard, “you should seriously respect their call.”
> 更廣泛地說,希爾斯說,研究人員傾向于將倫理學家和機構評論家視為更多的繁文縟節,或者將其視為不必要限制的來源,而這些限制將會扼殺進步。 埃斯維特對此表示贊同。 他說:"科學的建立是為了提升知識樹,品嘗它的果實,而大多數科學家的心態是,知識永遠是好的。"。 "我只是不相信這是真的。 有些事情我們最好不要知道。" 他認為科學企業需要圍繞潛在危險信息制定更好的規范。 首先: 不要分散它。 第二: 如果有人告訴你你的工作是一種信息危害,"你應該嚴肅地尊重他們的呼吁。"

Lentzos adds that scientists should be trained on these topics from the earliest stages of their careers. “It needs to start at the undergrad level, and be continually done for active researchers,” she says. There is a lot of talk about educating society about science. Perhaps what is more needed is educating scientists about society.
> 他補充說,科學家應該從他們職業生涯的最初階段就開始接受有關這些主題的培訓。 "它需要從本科生開始,并且不斷地為活躍的研究人員做,"她說。 有很多關于科學教育社會的討論。 也許更需要的是對科學家進行社會教育。

We want to hear what you think about this article. Submit a letter to the editor or write to letters@theatlantic.com.
> 我們想聽聽你對這篇文章的看法。 向編輯提交一封信,或寫信至 letter@theatlantic.com。

The above is the debugging output; the actual code is shown above as well.
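One thing the debugging run does not exercise is the failure path: translate() as written returns None when the status code is not 200, and zip() silently stops at the shorter list if the API returns fewer translations than paragraphs. The sketch below is one possible way to flatten the paragraph lists and pair them defensively. The helper names `flatten` and `pair_lines` are hypothetical and not part of the original script.

```python
from itertools import chain


def flatten(p_list):
    """Turn a list of paragraph lists into one flat list of strings."""
    return list(chain.from_iterable(p_list))


def pair_lines(source, translated):
    """Return the lines to write: each paragraph, then its translation as a quote.

    If translated is None (failed request) or shorter than source,
    the untranslated paragraphs are still written, just without a quote line.
    """
    if translated is None:
        translated = []
    lines = []
    for index, text in enumerate(source):
        lines.append('{}\n'.format(text))
        if index < len(translated):
            lines.append('> {}\n\n'.format(translated[index].strip()))
        else:
            lines.append('\n')
    return lines


paragraphs = flatten([['First paragraph.'], ['Second.', 'Third.']])
lines = pair_lines(['a', 'b'], [' x '])
```

With this shape, the file-writing function could call `f.writelines(pair_lines(source, translate(source)))` and still produce a readable Markdown file when a request fails, at the cost of leaving some paragraphs untranslated.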
Reposted from: https://www.cnblogs.com/JosonLee/p/10053718.html