Special 321: Google I/O 2017        

TWiT Live Specials (Video-HD)

At this year's I/O developers conference keynote, Google highlighted several areas: their advances in AI, Google Assistant, YouTube, Android O, VR, and AR. New products and services announced include Google Lens, TensorFlow Lite, YouTube Super Chat, and stand-alone Daydream VR devices. Nathan Olivarez-Giles and Stacey Higginbotham analyze the announcements.

Hosts: Nathan Olivarez-Giles and Stacey Higginbotham

Download or subscribe to this show at https://twit.tv/shows/twit-live-specials.

Thanks to CacheFly for the bandwidth for this special presentation.


          JETSON TX2: Installing TensorFlow 1.1.0        
Notes from installing TensorFlow 1.1.0 on the JETSON TX2, following JetsonHacks' "TensorFlow on NVIDIA Jetson TX2 Development Kit […]
          ä¸å®šæœŸML&NLPå ±#4        

This is a recurring roundup of recent machine learning and natural language processing news. The previous installment is here. If you spot an entry I have missed, I would appreciate a tip via the submission form.

Papers

  • [1701.07875] Wasserstein GAN
    • Generative tasks such as GANs are known to be hard to train, but this paper reports that training becomes much more stable when the Wasserstein distance is used as the distance during training (a minimal sketch of the critic objective follows below this list).
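As a concrete illustration of that idea, here is a minimal, framework-agnostic Python/NumPy sketch of the WGAN critic objective and weight clipping; critic, real_batch, and fake_batch are hypothetical placeholders, not code from the paper itself.

    import numpy as np

    def critic_loss(critic, real_batch, fake_batch):
        # WGAN critic objective: maximize E[f(real)] - E[f(fake)],
        # i.e. minimize the negated difference.
        return -(np.mean(critic(real_batch)) - np.mean(critic(fake_batch)))

    def clip_weights(weights, c=0.01):
        # Weight clipping keeps the critic roughly 1-Lipschitz, which is what
        # makes the estimated quantity behave like a Wasserstein distance.
        return [np.clip(w, -c, c) for w in weights]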

Blogs / study-group materials

speakerdeck.com

I hadn't heard of starchart, a tool for versioning machine learning models. There is also an explanatory article about it.

speakerdeck.com

Less about machine learning and more about log infrastructure.

Attempting to aggregate 30 TB of web behavior logs with AWS Athena, from Tetsutaro Watanabe
www.slideshare.net

Business

Conferences / meetups

NIPS paper-reading meetup

Kaggle Tokyo Meetup #2

Whole Brain Architecture Young Researchers' Group (全脳アーキテクチャ若手の会)

AAAI2017

Miscellaneous

Continuous Optimization for Machine Learning (Machine Learning Professional Series)

Relational Data Learning (Machine Learning Professional Series)


          ä¸å®šæœŸML&NLPå ±#1        

The other day I sat in on the front-end meetup that is held regularly at our company (a podcast of it is published). We chatted while browsing the site jser.info, and it seems like quite a good resource for keeping up with recent front-end trends.

Our machine learning study group also chats over lunch about recent topics, and since there is no real problem with publishing the list of entries we discuss and our comments on them, I decided to put them out as an irregular report. My own coverage of natural language processing is limited, and I can cover even less of machine learning outside NLP, so I have also set up a tip submission form. Useful tips are very welcome :)

Papers

Blogs / study-group materials

Business

Conferences / meetups

Coling2016

Held in Osaka this year.

NIPS2016

NL-ken (the 229th meeting of the Special Interest Group on Natural Language Processing)

(2) [NLC] An attempt at efficient dialogue-log collection using gamification
Shin Kanouchi and Mamoru Komachi (Tokyo Metropolitan University)

Plenty of methods exist for learning once the data has been collected, so I am interested in how to collect data efficiently in the first place.

(5) [NL] Automatic evaluation of naturalness in topic transitions of chat-oriented dialogue systems
Akihiro Toyoshima (NAIST), Hiroaki Sugiyama (NTT), Koichiro Yoshino and Satoshi Nakamura (NAIST)

(20) [NL] 14:30 – 15:00
An automatic evaluation metric for Japanese-English machine translation based on word alignment using distributed word representations
Junki Matsuo and Mamoru Komachi (Tokyo Metropolitan University), Katsuhito Sudoh (NTT)

Along with data collection, I think evaluation of tasks other than analysis tasks will become a hot topic going forward.

(15) [NL] 17:25 – 17:55
Operating NEologd, a dictionary-generation system for word segmentation, with document classification as an example
Toshinori Sato and Taiichi Hashimoto (LINE), Manabu Okumura (Tokyo Institute of Technology)

There was also a talk on NEologd, which has been seeing wider adoption in many places recently.

Annual Meeting of the Association for Natural Language Processing 2017 (NLP2017)

The contents of the tutorials, themed sessions, and workshops have also been announced.

Crowdsourcing
Yukino Baba (Kyoto University)
Neural machine translation
Toshiaki Nakazawa (JST)
Universal Dependencies
Hiroshi Kanayama (IBM Research - Tokyo)
Takaaki Tanaka (NTT Communication Science Laboratories)
Cognitive linguistics
Yoshiki Nishimura (The University of Tokyo)

I am particularly interested in the neural machine translation and Universal Dependencies tutorials.

IM飲み2016 (IM drinking meetup)

Miscellaneous


          R’s way for Deep Learning with Keras        
Keras is one of the deep learning frameworks setting the standard for high-level deep learning APIs. Starting with TensorFlow, it supports Theano and CNTK, and an MXNet interface is currently being developed, so it is increasingly establishing itself as the standard for deep learning. In practice I work with Keras (or TensorFlow) + Python, but apart from the deep learning framework itself, I am one of those people who thinks R is more effective in every aspect of handling data, and many people …
Read more
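As a taste of the kind of high-level API the post is describing, here is a minimal, hypothetical Python Keras sketch (the post itself uses the R interface; layer sizes and the commented-out training data are made up for illustration):

    from keras.models import Sequential
    from keras.layers import Dense

    # A tiny feed-forward binary classifier; sizes are arbitrary.
    model = Sequential([
        Dense(64, activation='relu', input_shape=(20,)),
        Dense(1, activation='sigmoid'),
    ])
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])
    # model.fit(x_train, y_train, epochs=5, batch_size=32)  # x_train / y_train are assumed data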
          R TensorFlow code released on GitHub        
The best way to study deep learning is to write the code for a few representative models yourself and try modeling with them. Recently many books publish their code on GitHub before the book itself comes out, and I have been spending spare moments picking one of those books and reimplementing its code in R. What makes this work somewhat tedious is that you have to understand and implement both the Python code and the R code well. Fortunately I have implementation experience with Python, so no big …
Read more
          Deep learning machines, and TensorFlow R word2vec code implementation/modeling        
TensorFlow Life: Our team has recently been using TensorFlow as its main tool across various internal projects, and we have been able to see results we had not seen before. Through using these tools and that experience, what I feel most strongly is that my perspective on problems has changed compared to a year ago. Because of this I am having new experiences almost every day, and along the way I feel I am gaining a deeper understanding of the algorithms. In the past, all …
Read more
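The excerpt above is cut off, so purely as background: a rough TensorFlow 1.x skip-gram word2vec sketch in Python (the post itself works through this in R; the vocabulary size, embedding size, and placeholders below are illustrative, not taken from the post):

    import tensorflow as tf

    vocab_size, embed_dim, num_sampled = 10000, 128, 64  # illustrative sizes

    # One skip-gram pair per row: a center-word id and a context-word id.
    center = tf.placeholder(tf.int32, shape=[None])
    context = tf.placeholder(tf.int64, shape=[None, 1])

    embeddings = tf.Variable(tf.random_uniform([vocab_size, embed_dim], -1.0, 1.0))
    embed = tf.nn.embedding_lookup(embeddings, center)

    nce_w = tf.Variable(tf.truncated_normal([vocab_size, embed_dim], stddev=0.1))
    nce_b = tf.Variable(tf.zeros([vocab_size]))

    # Noise-contrastive estimation, the usual trick for training word2vec efficiently.
    loss = tf.reduce_mean(tf.nn.nce_loss(weights=nce_w, biases=nce_b,
                                         labels=context, inputs=embed,
                                         num_sampled=num_sampled,
                                         num_classes=vocab_size))
    train_op = tf.train.GradientDescentOptimizer(1.0).minimize(loss)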
          TensorFlow with R        
The most important reason Python has recently come to be recognized as such a good tool for data analysis and machine learning is that it is the core language leading deep learning technology, and at the center of that is TensorFlow. In my case I built several deep learning models on top of MXNet, and some of them are performing well as genuinely important models in production. Of course, the most important reason I used MXNet is that it is one of the few frameworks that supports R …
Read more
          Spring 2017 tech reading        
Hello and a belated happy new year to you! Here's another big list of articles I thought were worth sharing. As always thanks to the authors who wrote these articles and to the people who shared them on Twitter/HackerNews/etc.

Distributed systems (and even plain systems)

Tuning

SQL lateral view

Docker and containers

Science and math

Golang

Java streams and reactive systems

Java Lambdas

Just Java

General and/or fun

Until next time!

          Stuff The Internet Says On Scalability For July 7th, 2017        

Hey, it's HighScalability time:


What's real these days? I was at Lascaux II, an exact replica of Lascaux. I was deeply, deeply moved. Was this an authentic experience? A question we'll ask often in VR I think.

If you like this sort of Stuff then please support me on Patreon.
  • $400k: cost of yearly fake news campaign; $50,000: cost to discredit a journalist; 100 Gbps: SSDP DDoS amplification attack; $5.97BN: wild guess on cost of running Facebook on AWS; 2 billion: Facebook users; 80%: Spotify backend services in production run as containers; $60B: AR market by 2021; 10.4%: AMD market share taken from Intel; 5 days: MIT drone flight time; $1 trillion: Apple iOS revenues; 35%-144%: reduction in image sizes; 10 petabytes: Ancestry.com data stored; 1 trillion: photos taken on iPhone each year; $70B: Apple App Store payout to developers; 355: pages in Internet Trends 2017 report; 14: people needed to make 500,000 tons of steel; 25%: reduced server-rendering time with Node 8; 50-70%: of messages Gmail receives are spam; 8,000: bugs found in pacemaker code; 

  • Quotable Quotes:
    • Vladimir Putin: We must take into account the plans and directions of development of the armed forces of other countries… Our responses must be based on intellectual superiority, they will be asymmetric, and less expensive.
    • @swardley: What most fail to realise is that the Chinese corporate corpus has devoured western business thinking and gone beyond it.
    • @discostu105: I am a 10X developer. Everything I do takes ten times as long as I thought.
    • DINKDINK: You grossly underestimate the hashing capacity of the bitcoin network. The hashing capacity, at time of posting, is approximately 5,000,000,000 Gigahashes/second[1]. Spot measurement of the hashing capacity of an EC2 instance is 0.4 Gigahashes/second[2]. You would need 12 BILLION EC2 instances to 51% attack the bitcoin network.[3] Using EC2 to attack the network is impractical and inefficient.
    • danielsamuels && 19eightyfour~ Machiavelli's Guide to PaaS: Keep your friends close, and your competitors hosted.
    • Paul Buchheit: I wrote the first version of Gmail in a day!
    • @herminghaus: If you don’t care about latency, ship a 20ft intermodal container full of 32GB micro-SD cards across the globe. It’s a terabyte per second.
    • @cstross: Okay, so now the Russian defense industry is advertising war-in-a-can (multimodal freight containerized missiles):
    • Dennett~ you don't need comprehension to achieve competence.
    • @michellebrush~ Schema are APIs. @gwenshap #qconnyc
    • Stacy Mitchell: Amazon sells more clothing, electronics, toys, and books than any other company. Last year, Amazon captured nearly $1 of every $2 Americans spent online. As recently as 2015, most people looking to buy something online started at a search engine. Today, a majority go straight to Amazon.
    • Xcelerate: I have noticed that Azure does have a few powerful features that AWS and GCP lack, most notably InfiniBand (fast interconnects), which I have needed on more than one occasion for HPC tasks. In fact, 4x16 core instances on Azure are currently faster at performing molecular dynamics simulations than 1x"64 core" instance on GCP. But the cost is extremely high, and I still haven't found a good cloud platform for short, high intensity HPC tasks.
    • jjeaff: I took about 5 sites from a $50 a month shared cPanel plan that included a few WordPress blogs and some custom sites and put them on a $3 a month scaleway instance and haven't had a bit of trouble.
    • @discordianfish: GCP's Pub/Sub is really priced by GB? And 10GB/free/month? What's the catch?
    • Amazon: This moves beyond the current paradigm of typing search keywords in a box and navigating a website. Instead, discovery should be like talking with a friend who knows you, knows what you like, works with you at every step, and anticipates your needs. This is a vision where intelligence is everywhere. Every interaction should reflect who you are and what you like, and help you find what other people like you have already discovered. 
    • @CloudifySource: Lambda is always 100% busy - @adrianco #awasummit #telaviv #serverless
    • @codinghorror: Funny how Android sites have internalized this "only multi core scores now matter" narrative with 1/2 the CPU speed of iOS hardware
    • @sheeshee: deleted all home directories because no separation of "dev" & "production". almost ran a billion euro site into the ground with a bad loop.
    • We have quotes the likes of which even God has never seen! Please click through to read all of them.

  • The Not Hotdog app on Silicon Valley may be a bit silly, but the story of how they built the real app is one of the best how-tos on building a machine learning app you'll ever read. How HBO’s Silicon Valley built “Not Hotdog” with mobile TensorFlow, Keras & React Native. The initial app was built in a weekend using Google Cloud Platform’s Vision API and React Native. The final version took months of refinement. Google Cloud’s Vision API was dropped because its accuracy in recognizing hotdogs was only so-so; it was slow because of the network hit; it cost too much. They ended up using Keras, a deep learning library that provides nicer, easier-to-use abstractions on top of TensorFlow. They settled on SqueezeNet due to its explicit positioning as a solution for embedded deep learning. SqueezeNet used only 1.25 million parameters, which made training much faster and reduced resource usage on the device. What would they change? timanglade: Honestly I think the biggest gains would be to go back to a beefier, pre-trained architecture like Inception, and see if I can quantize it to a size that’s manageable, especially if paired with CoreML on device. You’d get the accuracy that comes from big models, but in a package that runs well on mobile. And this is really cool: The last production trick we used was to leverage CodePush and Apple’s relatively permissive terms of service, to live-inject new versions of our neural networks after submission to the app store. (A rough sketch of this kind of small-footprint classifier appears after this list of items.)

  • And the winner is: all of us. Serverless Hosting Comparison: Lambda: Unicorn: $20,830.83. Heavy: $120.16. Medium: $4.55. Light: $0.00; Azure Functions: Unicorn: $19,993.60. Heavy: $115.40. Moderate: $3.60. Light: $0.00; Cloud Functions: Unicorn: $23,321.20. Heavy: $138.95. Moderate: $9.76. Light: $0.00; OpenWhisk: Unicorn: $21,243.20. Heavy: $120.70. Medium: $3.83. Light: $0.00; Fission.io: depends on the cost of running your managed Kubernetes cloud. 

  • Minds are algorithms made physical. Seeds May Use Tiny “Brains” to Decide When to Germinate: The seed has two hormones: abscisic acid (ABA), which sends the signal to stay dormant, and gibberellin (GA), which initiates germination. The push and pull between those two hormones helps the seed determine just the right time to start growing...According to Ghose, some 3,000 to 4,000 cells make up the Arabidopsis seeds...It turned out that the hormones clustered in two sections of cells near the tip of the seed—a region the researchers propose makes up the “brain.” The two clumps of cells produce the hormones which they send as signals between each other. When ABA, produced by one clump, is the dominant hormone in this decision center, the seed stays dormant. But as GA increases, the “brain” begins telling the seed it’s time to sprout...This splitting of the command center helps the seed make more accurate decisions.
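As promised in the Not Hotdog item above, here is a rough, hypothetical Keras sketch of the kind of small-footprint convolutional binary classifier the article describes. It is not the app's actual architecture; the fire-module sizes and input resolution are invented for illustration.

    from keras.models import Model
    from keras.layers import (Input, Conv2D, MaxPooling2D,
                              GlobalAveragePooling2D, Dense, concatenate)

    def fire_module(x, squeeze, expand):
        # SqueezeNet-style block: squeeze with 1x1 convs, expand with 1x1 and 3x3.
        s = Conv2D(squeeze, (1, 1), activation='relu', padding='same')(x)
        e1 = Conv2D(expand, (1, 1), activation='relu', padding='same')(s)
        e3 = Conv2D(expand, (3, 3), activation='relu', padding='same')(s)
        return concatenate([e1, e3])

    inp = Input(shape=(128, 128, 3))            # input resolution is illustrative
    x = Conv2D(32, (3, 3), strides=2, activation='relu')(inp)
    x = MaxPooling2D()(x)
    x = fire_module(x, squeeze=16, expand=32)
    x = fire_module(x, squeeze=16, expand=32)
    x = GlobalAveragePooling2D()(x)
    out = Dense(1, activation='sigmoid')(x)     # hotdog vs. not hotdog

    model = Model(inp, out)
    model.compile(optimizer='adam', loss='binary_crossentropy', metrics=['accuracy'])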

Don't miss all that the Internet has to say on Scalability, click below and become eventually consistent with all scalability knowledge (which means this post has many more items to read so please keep on reading)...


          100 announcements (!) from Google Cloud Next '17        

San Francisco — What a week! Google Cloud Next ‘17 has come to an end, but really, it’s just the beginning. We welcomed 10,000+ attendees including customers, partners, developers, IT leaders, engineers, press, analysts, cloud enthusiasts (and skeptics). Together we engaged in 3 days of keynotes, 200+ sessions, and 4 invitation-only summits. Hard to believe this was our first show as all of Google Cloud with GCP, G Suite, Chrome, Maps and Education. Thank you to all who were here with us in San Francisco this week, and we hope to see you next year.

If you’re a fan of video highlights, we’ve got you covered. Check out our Day 1 keynote (in less than 4 minutes) and Day 2 keynote (in under 5!).

One of the common refrains from customers and partners throughout the conference was “Wow, you’ve been busy. I can’t believe how many announcements you’ve had at Next!” So we decided to count all the announcements from across Google Cloud and in fact we had 100 (!) announcements this week.

For the list lovers amongst you, we’ve compiled a handy-dandy run-down of our announcements from the past few days:


Google Cloud is excited to welcome two new acquisitions to the Google Cloud family this week, Kaggle and AppBridge.

1. Kaggle - Kaggle is one of the world's largest communities of data scientists and machine learning enthusiasts. Kaggle and Google Cloud will continue to support machine learning training and deployment services in addition to offering the community the ability to store and query large datasets.

2. AppBridge - Google Cloud acquired Vancouver-based AppBridge this week, which helps you migrate data from on-prem file servers into G Suite and Google Drive.


Google Cloud brings a suite of new security features to Google Cloud Platform and G Suite designed to help safeguard your company’s assets and prevent disruption to your business: 

3. Identity-Aware Proxy (IAP) for Google Cloud Platform (Beta) - Identity-Aware Proxy lets you provide access to applications based on risk, rather than using a VPN. It provides secure application access from anywhere, restricts access by user, identity and group, deploys with integrated phishing-resistant Security Key, and is easier to set up than end-user VPN.

4. Data Loss Prevention (DLP) for Google Cloud Platform (Beta) - Data Loss Prevention API lets you scan data for 40+ sensitive data types, and is used as part of DLP in Gmail and Drive. You can find and redact sensitive data stored in GCP, invigorate old applications with new sensitive data sensing “smarts” and use predefined detectors as well as customize your own.

5. Key Management Service (KMS) for Google Cloud Platform (GA) - Key Management Service allows you to generate, use, rotate, and destroy symmetric encryption keys for use in the cloud.

6. Security Key Enforcement (SKE) for Google Cloud Platform (GA) - Security Key Enforcement allows you to require security keys be used as the 2-Step verification factor for enhanced anti-phishing security whenever a GCP application is accessed.

7. Vault for Google Drive (GA) - Google Vault is the eDiscovery and archiving solution for G Suite. Vault enables admins to easily manage their G Suite data lifecycle and search, preview and export the G Suite data in their domain. Vault for Drive enables full support for Google Drive content, including Team Drive files.

8. Google-designed security chip, Titan - Google uses Titan to establish hardware root of trust, allowing us to securely identify and authenticate legitimate access at the hardware level. Titan includes a hardware random number generator, performs cryptographic operations in the isolated memory, and has a dedicated secure processor (on-chip).


New GCP data analytics products and services help organizations solve business problems with data, rather than spending time and resources building, integrating and managing the underlying infrastructure:

9. BigQuery Data Transfer Service (Private Beta) - BigQuery Data Transfer Service makes it easy for users to quickly get value from all their Google-managed advertising datasets. With just a few clicks, marketing analysts can schedule data imports from Google Adwords, DoubleClick Campaign Manager, DoubleClick for Publishers and YouTube Content and Channel Owner reports.

10. Cloud Dataprep (Private Beta) - Cloud Dataprep is a new managed data service, built in collaboration with Trifacta, that makes it faster and easier for BigQuery end-users to visually explore and prepare data for analysis without the need for dedicated data engineer resources.

11. New Commercial Datasets - Businesses often look for datasets (public or commercial) outside their organizational boundaries. Commercial datasets offered include financial market data from Xignite, residential real-estate valuations (historical and projected) from HouseCanary, predictions for when a house will go on sale from Remine, historical weather data from AccuWeather, and news archives from Dow Jones, all immediately ready for use in BigQuery (with more to come as new partners join the program).

12. Python for Google Cloud Dataflow in GA - Cloud Dataflow is a fully managed data processing service supporting both batch and stream execution of pipelines. Until recently, these benefits have been available solely to Java developers. Now there’s a Python SDK for Cloud Dataflow in GA.
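For a sense of what the new Python SDK looks like, here is a minimal, hypothetical word-count-style Dataflow pipeline; the bucket paths are placeholders and the runner/project options are omitted, so treat it as a sketch rather than a ready-to-run job:

    import apache_beam as beam

    # Runs locally on the DirectRunner by default; pass DataflowRunner options
    # to execute on Google Cloud Dataflow. Paths below are placeholders.
    with beam.Pipeline() as p:
        (p
         | 'Read' >> beam.io.ReadFromText('gs://my-bucket/input.txt')
         | 'Split' >> beam.FlatMap(lambda line: line.split())
         | 'PairWithOne' >> beam.Map(lambda word: (word, 1))
         | 'Count' >> beam.CombinePerKey(sum)
         | 'Format' >> beam.Map(lambda kv: '%s: %d' % kv)
         | 'Write' >> beam.io.WriteToText('gs://my-bucket/output'))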

13. Stackdriver Monitoring for Cloud Dataflow (Beta) - We’ve integrated Cloud Dataflow with Stackdriver Monitoring so that you can access and analyze Cloud Dataflow job metrics and create alerts for specific Dataflow job conditions.

14. Google Cloud Datalab in GA - This interactive data science workflow tool makes it easy to do iterative model and data analysis in a Jupyter notebook-based environment using standard SQL, Python and shell commands.

15. Cloud Dataproc updates - Our fully managed service for running Apache Spark, Flink and Hadoop pipelines adds beta support for restarting failed jobs (including automatic restart as needed), the ability to create single-node clusters for lightweight sandbox development (beta), and GPU support; the cloud labels feature, for more flexibility managing your Dataproc resources, is now GA.


New GCP databases and database features round out a platform on which developers can build great applications across a spectrum of use cases:

16. Cloud SQL for PostgreSQL (Beta) - Cloud SQL for PostgreSQL implements the same design principles currently reflected in Cloud SQL for MySQL, namely, the ability to securely store and connect to your relational data via open standards.

17. Microsoft SQL Server Enterprise (GA) - Available on Google Compute Engine, plus support for Windows Server Failover Clustering (WSFC) and SQL Server AlwaysOn Availability (GA).

18. Cloud SQL for MySQL improvements - Increased performance for demanding workloads via 32-core instances with up to 208GB of RAM, and central management of resources via Identity and Access Management (IAM) controls.

19. Cloud Spanner - Launched a month ago, but still, it would be remiss not to mention it because, hello, it’s Cloud Spanner! The industry’s first horizontally scalable, globally consistent, relational database service.

20. SSD persistent-disk performance improvements - SSD persistent disks now have increased throughput and IOPS performance, which are particularly beneficial for database and analytics workloads. Read these docs for complete details about persistent-disk performance.

21. Federated query on Cloud Bigtable - We’ve extended BigQuery’s reach to query data inside Cloud Bigtable, the NoSQL database service for massive analytic or operational workloads that require low latency and high throughput (particularly common in Financial Services and IoT use cases).


New GCP Cloud Machine Learning services bolster our efforts to make machine learning accessible to organizations of all sizes and sophistication:

22.  Cloud Machine Learning Engine (GA) - Cloud ML Engine, now generally available, is for organizations that want to train and deploy their own models into production in the cloud.

23. Cloud Video Intelligence API (Private Beta) - A first of its kind, Cloud Video Intelligence API lets developers easily search and discover video content by providing information about entities (nouns such as “dog,” “flower”, or “human” or verbs such as “run,” “swim,” or “fly”) inside video content.

24. Cloud Vision API (GA) - Cloud Vision API reaches GA and offers new capabilities for enterprises and partners to classify a more diverse set of images. The API can now recognize millions of entities from Google’s Knowledge Graph and offers enhanced OCR capabilities that can extract text from scans of text-heavy documents such as legal contracts or research papers or books.

25. Machine learning Advanced Solution Lab (ASL) - ASL provides dedicated facilities for our customers to directly collaborate with Google’s machine-learning experts to apply ML to their most pressing challenges.

26. Cloud Jobs API - A powerful aid to job search and discovery, Cloud Jobs API now has new features such as Commute Search, which will return relevant jobs based on desired commute time and preferred mode of transportation.

27. Machine Learning Startup Competition - We announced a Machine Learning Startup Competition in collaboration with venture capital firms Data Collective and Emergence Capital, and with additional support from a16z, Greylock Partners, GV, Kleiner Perkins Caufield & Byers and Sequoia Capital.


New GCP pricing continues our intention to create customer-friendly pricing that’s as smart as our products; and support services that are geared towards meeting our customers where they are:

28. Compute Engine price cuts - Continuing our history of pricing leadership, we’ve cut Google Compute Engine prices by up to 8%.

29. Committed Use Discounts - With Committed Use Discounts, customers can receive a discount of up to 57% off our list price, in exchange for a one or three year purchase commitment paid monthly, with no upfront costs.

30. Free trial extended to 12 months - We’ve extended our free trial from 60 days to 12 months, allowing you to use your $300 credit across all GCP services and APIs, at your own pace and schedule. Plus, we’ve introduced new Always Free products -- non-expiring usage limits that you can use to test and develop applications at no cost. Visit the Google Cloud Platform Free Tier page for details.

31. Engineering Support - Our new Engineering Support offering is a role-based subscription model that allows us to match engineer to engineer, to meet you where your business is, no matter what stage of development you’re in. It has 3 tiers:

  • Development engineering support - ideal for developers or QA engineers that can manage with a response within four to eight business hours, priced at $100/user per month.
  • Production engineering support provides a one-hour response time for critical issues at $250/user per month.
  • On-call engineering support pages a Google engineer and delivers a 15-minute response time 24x7 for critical issues at $1,500/user per month.

32. Cloud.google.com/community site - Google Cloud Platform Community is a new site to learn, connect and share with other people like you, who are interested in GCP. You can follow along with tutorials or submit one yourself, find meetups in your area, and learn about community resources for GCP support, open source projects and more.


New GCP developer platforms and tools reinforce our commitment to openness and choice and giving you what you need to move fast and focus on great code.

33. Google App Engine Flex (GA) - We announced a major expansion of our popular App Engine platform to new developer communities that emphasizes openness, developer choice, and application portability.

34. Cloud Functions (Beta) - Google Cloud Functions has launched into public beta. It is a serverless environment for creating event-driven applications and microservices, letting you build and connect cloud services with code.

35. Firebase integration with GCP (GA) - Firebase Storage is now Google Cloud Storage for Firebase and adds support for multiple buckets, support for linking to existing buckets, and integrates with Google Cloud Functions.

36. Cloud Container Builder - Cloud Container Builder is a standalone tool that lets you build your Docker containers on GCP regardless of deployment environment. It’s a fast, reliable, and consistent way to package your software into containers as part of an automated workflow.

37. Community Tutorials (Beta)  - With community tutorials, anyone can now submit or request a technical how-to for Google Cloud Platform.


Secure, global and high-performance, we’ve built our cloud for the long haul. This week we announced a slew of new infrastructure updates. 

38. New data center region: California - This new GCP region delivers lower latency for customers on the West Coast of the U.S. and adjacent geographic areas. Like other Google Cloud regions, it will feature a minimum of three zones, benefit from Google’s global, private fibre network, and offer a complement of GCP services.

39. New data center region: Montreal - This new GCP region delivers lower latency for customers in Canada and adjacent geographic areas. Like other Google Cloud regions, it will feature a minimum of three zones, benefit from Google’s global, private fibre network, and offer a complement of GCP services.

40. New data center region: Netherlands - This new GCP region delivers lower latency for customers in Western Europe and adjacent geographic areas. Like other Google Cloud regions, it will feature a minimum of three zones, benefit from Google’s global, private fibre network, and offer a complement of GCP services.

41. Google Container Engine - Managed Nodes - Google Container Engine (GKE) has added Automated Monitoring and Repair of your GKE nodes, letting you focus on your applications while Google ensures your cluster is available and up-to-date.

42. 64 Core machines + more memory - We have doubled the number of vCPUs you can run in an instance from 32 to 64, with up to 416GB of memory per instance.

43. Internal Load balancing (GA) - Internal Load Balancing, now GA, lets you run and scale your services behind a private load balancing IP address which is accessible only to your internal instances, not the internet.

44. Cross-Project Networking (Beta) - Cross-Project Networking (XPN), now in beta, is a virtual network that provides a common network across several Google Cloud Platform projects, enabling simple multi-tenant deployments.


In the past year, we’ve launched 300+ features and updates for G Suite and this week we announced our next generation of collaboration and communication tools.

45. Team Drives (GA for G Suite Business, Education and Enterprise customers) - Team Drives help teams simply and securely manage permissions, ownership and file access for an organization within Google Drive.

46. Drive File Stream (EAP) - Drive File Stream is a way to quickly stream files directly from the cloud to your computer. With Drive File Stream, company data can be accessed directly from your laptop, even if you don’t have much space on your hard drive.

47. Google Vault for Drive (GA for G Suite Business, Education and Enterprise customers) - Google Vault for Drive now gives admins the governance controls they need to manage and secure all of their files, including employee Drives and Team Drives. Google Vault for Drive also lets admins set retention policies that automatically keep what’s needed and delete what’s not.

48. Quick Access in Team Drives (GA) - powered by Google’s machine intelligence, Quick Access helps to surface the right information for employees at the right time within Google Drive. Quick Access now works with Team Drives on iOS and Android devices, and is coming soon to the web.

49. Hangouts Meet (GA to existing customers) - Hangouts Meet is a new video meeting experience built on Hangouts that can run 30-person video conferences without accounts, plugins or downloads. For G Suite Enterprise customers, each call comes with a dedicated dial-in phone number so that team members on the road can join meetings without wifi or data issues.

50. Hangouts Chat (EAP) - Hangouts Chat is an intelligent communication app in Hangouts with dedicated, virtual rooms that connect cross-functional enterprise teams. Hangouts Chat integrates with G Suite apps like Drive and Docs, as well as photos, videos and other third-party enterprise apps.

51. @meet - @meet is an intelligent bot built on top of the Hangouts platform that uses natural language processing and machine learning to automatically schedule meetings for your team with Hangouts Meet and Google Calendar.

52. Gmail Add-ons for G Suite (Developer Preview) - Gmail Add-ons provide a way to surface the functionality of your app or service directly in Gmail. With Add-ons, developers only build their integration once, and it runs natively in Gmail on web, Android and iOS.

53. Edit Opportunities in Google Sheets - With Edit Opportunities in Google Sheets, sales reps can sync a Salesforce Opportunity List View to Sheets to bulk edit data; changes are synced automatically to Salesforce, no upload required.

54. Jamboard - Our whiteboard in the cloud goes GA in May! Jamboard merges the worlds of physical and digital creativity. It’s real time collaboration on a brilliant scale, whether your team is together in the conference room or spread all over the world.


Building on the momentum from a growing number of businesses using Chrome digital signage and kiosks, we added new management tools and APIs in addition to introducing support for Android Kiosk apps on supported Chrome devices. 

55. Android Kiosk Apps for Chrome - Android Kiosk for Chrome lets users manage and deploy Chrome digital signage and kiosks for both web and Android apps. And with Public Session Kiosks, IT admins can now add a number of Chrome packaged apps alongside hosted apps.

56. Chrome Kiosk Management Free trial - This free trial gives customers an easy way to test out Chrome for signage and kiosk deployments.

57. Chrome Device Management (CDM) APIs for Kiosks - These APIs offer programmatic access to various Kiosk policies. IT admins can schedule a device reboot through the new APIs and integrate that functionality directly in a third-party console.

58. Chrome Stability API - This new API allows Kiosk app developers to improve the reliability of the application and the system.


Attendees at Google Cloud Next ‘17 heard stories from many of our valued customers:

59. Colgate - Colgate-Palmolive partnered with Google Cloud and SAP to bring thousands of employees together through G Suite collaboration and productivity tools. The company deployed G Suite to 28,000 employees in less than six months.

60. Disney Consumer Products & Interactive (DCPI) - DCPI is on target to migrate out of its legacy infrastructure this year, and is leveraging machine learning to power next generation guest experiences.

61. eBay - eBay uses Google Cloud technologies including Google Container Engine, Machine Learning and AI for its ShopBot, a personal shopping bot on Facebook Messenger.

62. HSBC - HSBC is one of the world's largest financial and banking institutions and is making a large investment in transforming its global IT. The company is working closely with Google to deploy Cloud DataFlow, BigQuery and other data services to power critical proof of concept projects.

63. LUSH - LUSH migrated its global e-commerce site from AWS to GCP in less than six weeks, significantly improving the reliability and stability of its site. LUSH benefits from GCP’s ability to scale as transaction volume surges, which is critical for a retail business. In addition, Google's commitment to renewable energy sources aligns with LUSH's ethical principles.

64. Oden Technologies - Oden was part of Google Cloud’s startup program, and switched its entire platform to GCP from AWS. GCP offers Oden the ability to reliably scale while keeping costs low, perform under heavy loads and consistently delivers sophisticated features including machine learning and data analytics.

65. Planet - Planet migrated to GCP in February, looking to accelerate their workloads and leverage Google Cloud for several key advantages: price stability and predictability, custom instances, first-class Kubernetes support, and Machine Learning technology. Planet also announced the beta release of their Explorer platform.

66. Schlumberger - Schlumberger is making a critical investment in the cloud, turning to GCP to enable high-performance computing, remote visualization and development velocity. GCP is helping Schlumberger deliver innovative products and services to its customers by using HPC to scale data processing, workflow and advanced algorithms.

67. The Home Depot - The Home Depot collaborated with GCP’s Customer Reliability Engineering team to migrate HomeDepot.com to the cloud in time for Black Friday and Cyber Monday. Moving to GCP has allowed the company to better manage huge traffic spikes at peak shopping times throughout the year.

68. Verizon - Verizon is deploying G Suite to more than 150,000 of its employees, allowing for collaboration and flexibility in the workplace while maintaining security and compliance standards. Verizon and Google Cloud have been working together for more than a year to bring simple and secure productivity solutions to Verizon’s workforce.


We brought together Google Cloud partners from our growing ecosystem across G Suite, GCP, Maps, Devices and Education. Our partnering philosophy is driven by a set of principles that emphasize openness, innovation, fairness, transparency and shared success in the cloud market. Here are some of our partners who were out in force at the show:

69. Accenture - Accenture announced that it has designed a mobility solution for Rentokil, a global pest control company, built in collaboration with Google as part of the partnership announced at Horizon in September.

70. Alooma - Alooma announced the integration of the Alooma service with Google Cloud SQL and BigQuery.

71. Authorized Training Partner Program - To help companies scale their training offerings more quickly, and to enable Google to add other training partners to the ecosystem, we are introducing a new track within our partner program to support their unique offerings and needs.

72. Check Point - Check Point® Software Technologies announced Check Point vSEC for Google Cloud Platform, delivering advanced security integrated with GCP as well as their joining of the Google Cloud Technology Partner Program.

73. CloudEndure - We’re collaborating with CloudEndure to offer a no cost, self-service migration tool for Google Cloud Platform (GCP) customers.

74. Coursera - Coursera announced that it is collaborating with Google Cloud Platform to provide an extensive range of Google Cloud training courses. To celebrate this announcement, Coursera is offering all NEXT attendees a 100% discount for the GCP fundamentals class.

75. DocuSign - DocuSign announced deeper integrations with Google Docs.

76. Egnyte - Egnyte announced an enhanced integration with Google Docs that will allow our joint customers to create, edit, and store Google Docs, Sheets and Slides files right from within Egnyte Connect.

77. Google Cloud Global Partner Awards - We recognized 12 Google Cloud partners that demonstrated strong customer success and solution innovation over the past year: Accenture, Pivotal, LumApps, Slack, Looker, Palo Alto Networks, Virtru, SoftBank, DoIT, Snowdrop Solutions, CDW Corporation, and SYNNEX Corporation.

78. iCharts - iCharts announced additional support for several GCP databases, free pivot tables for current Google BigQuery users, and a new product dubbed “iCharts for SaaS.”

79. Intel - In addition to the progress with Skylake, Intel and Google Cloud launched several technology initiatives and market education efforts covering IoT, Kubernetes and TensorFlow, including optimizations, a developer program and tool kits.

80. Intuit - Intuit announced Gmail Add-Ons, which are designed to integrate custom workflows into Gmail based on the context of a given email.

81. Liftigniter - Liftigniter is a member of Google Cloud’s startup program and focused on machine learning personalization using predictive analytics to improve CTR on web and in-app.

82. Looker - Looker launched a suite of Looker Blocks, compatible with Google BigQuery Data Transfer Service, designed to give marketers the tools to enhance analysis of their critical data.

83. Low interest loans for partners - To help Premier Partners grow their teams, Google announced that capital investments are available to qualified partners in the form of low interest loans.

84. MicroStrategy - MicroStrategy announced an integration with Google Cloud SQL for PostgreSQL and Google Cloud SQL for MySQL.

85. New incentives to accelerate partner growth - We are increasing our investments in multiple existing and new incentive programs, including low interest loans to help Premier Partners grow their teams, increased co-funding to accelerate deals, and expanded rebate programs.

86. Orbitera Test Drives for GCP Partners - Test Drives allow customers to try partners’ software and generate high quality leads that can be passed directly to the partners’ sales teams. Google is offering Premier Cloud Partners one year of free Test Drives on Orbitera.

87. Partner specializations - Partners demonstrating strong customer success and technical proficiency in certain solution areas will now qualify to apply for a specialization. We’re launching specializations in application development, data analytics, machine learning and infrastructure.

88. Pivotal - GCP announced Pivotal as our first CRE technology partner. CRE technology partners will work hand-in-hand with Google to thoroughly review their solutions and implement changes to address identified risks to reliability.

89. ProsperWorks - ProsperWorks announced Gmail Add-Ons, which are designed to integrate custom workflows into Gmail based on the context of a given email.

90. Qwiklabs - This recent acquisition will provide Authorized Training Partners the ability to offer hands-on labs and comprehensive courses developed by Google experts to our customers.

91. Rackspace - Rackspace announced a strategic relationship with Google Cloud to become its first managed services support partner for GCP, with plans to collaborate on a new managed services offering for GCP customers set to launch later this year.

92. Rocket.Chat - Rocket.Chat, a member of Google Cloud’s startup program, is adding a number of new product integrations with GCP, including Autotranslate via the Translate API, integration with the Vision API to screen for inappropriate content, integration with the NLP API to perform sentiment analysis on public channels, integration with G Suite for authentication, and a full move of back-end storage to Google Cloud Storage.

93. Salesforce - Salesforce announced Gmail Add-Ons, which are designed to integrate custom workflows into Gmail based on the context of a given email.

94. SAP - This strategic partnership includes certification of SAP HANA on GCP, new G Suite integrations and future collaboration on building machine learning features into intelligent applications like conversational apps that guide users through complex workflows and transactions.

95. Smyte - Smyte participated in the Google Cloud startup program and protects millions of actions a day on websites and mobile applications. Smyte recently moved from self-hosted Kubernetes to Google Container Engine (GKE).

96. Veritas - Veritas expanded its partnership with Google Cloud to provide joint customers with 360 Data Management capabilities. The partnership will help reduce data storage costs, increase compliance and eDiscovery readiness and accelerate the customer’s journey to Google Cloud Platform.

97. VMware Airwatch - Airwatch provides enterprise mobility management solutions for Android and continues to drive the Google Device ecosystem to enterprise customers.

98. Windows Partner Program - We’re working with top systems integrators in the Windows community to help GCP customers take full advantage of Windows and .NET apps and services on our platform.

99. Xplenty - Xplenty announced the addition of two new services from Google Cloud into their available integrations: Google Cloud Spanner and Google Cloud SQL for PostgreSQL.

100. Zoomdata - Zoomdata announced support for Google’s Cloud Spanner and PostgreSQL on GCP, as well as enhancements to the existing Zoomdata Smart Connector for Google BigQuery. With these new capabilities Zoomdata offers deeply integrated and optimized support for Google Cloud Platform’s Cloud Spanner, PostgreSQL, Google BigQuery, and Cloud DataProc services.

We’re thrilled to have so many new products and partners that can help all of our customers grow. And as our final announcement for Google Cloud Next ’17 — please save the date for Next 2018: June 4–6 in San Francisco.

I guess that makes it 101. :-)



           Solving ill-posed inverse problems using iterative deep neural networks / Jobs: 2 Postdocs @ KTH, Sweden - implementation -        
Ozan just sent me the following e-mail. It has the right mix of elements of The Great Convergence: applying learning-to-learn methods to inverse problems that are some of the problems we thought compressive sensing could solve well (CT tomography), papers supporting those results, an implementation, a blog entry and two postdoc jobs. Awesome!
Dear Igor,


I have for some time followed your excellent blog Nuit Blanche. I'm not familiar with how you select entries for Nuit Blanche, but let me take the opportunity to provide potential input for Nuit Blanche on the exciting research we pursue at the Department of Mathematics, KTH Royal Institute of Technology. If you find any of this interesting, please feel free to post it on Nuit Blanche.


1. Deep learning and tomographic image reconstruction
The main objective for the research is to develop theory and algorithms for 3D tomographic reconstruction. An important recent development has been to use techniques from deep learning to solve inverse problems. We have developed a rather generic, yet adaptable, framework that combines elements of variational regularization with machine learning for solving large scale inverse problems. More precisely, the idea is to learn a reconstruction scheme by making use of the forward operator, noise model and other a priori information. This goes beyond learning a denoiser, where one first performs an initial (non machine-learning) reconstruction and then uses machine learning on the resulting image-to-image (denoising) problem. Several groups have pursued this learned-denoiser approach, and the results are in fact quite remarkable, outperforming previous state-of-the-art methods. Our approach, however, combines the reconstruction and denoising steps, which further improves the results. The following two arXiv reports http://arxiv.org/abs/1707.06474 and http://arxiv.org/abs/1704.04058 provide more details; there is also a blog post at http://adler-j.github.io/2017/07/21/Learning-to-reconstruct.html by one of our PhD students that explains this idea of "learning to reconstruct".


2. Post doctoral fellowships
I'm looking for two 2-year post-doctoral fellowships, one dealing with regularization of spatiotemporal and/or multichannel images and the other with methods for combining elements of variational regularization with deep learning for solving inverse problems. The announcements are given below. I would be glad if you could post these also on your blog.


Postdoctoral fellow in PET/SPECT Image Reconstruction (S-2017-1166)
Deadline: December 1, 2017
Brief description:
The position includes research & development of algorithms for PET and SPECT image reconstruction. Work is closely related to on-going research on (a) multi-channel regularization for PET/CT and SPECT/CT imaging, (b) joint reconstruction and image matching for spatio-temporal pulmonary PET/CT and cardiac SPECT/CT imaging, and (c) task-based reconstruction by iterative deep neural networks. An important part is to integrate routines for forward and backprojection from reconstruction packages like STIR and EMrecon for PET and NiftyRec for SPECT with ODL (http://github.com/odlgroup/odl), our Python based framework for reconstruction. Part of the research may include industrial (Elekta and Philips Healthcare) and clinical (Karolinska University Hospital) collaboration.
Announcement & instructions:
http://www.kth.se/en/om/work-at-kth/lediga-jobb/what:job/jobID:158920/type:job/where:4/apply:1

Postdoctoral fellow in Image Reconstruction/Deep Dictionary Learning (S-2017-1165)
Deadline: December 1, 2017
Brief description:

The position includes research & development of theory and algorithms that combine methods from machine learning with sparse signal processing for joint dictionary design and image reconstruction in tomography. A key element is to design dictionaries that not only yield sparse representation, but also contain discriminative information. Methods will be implemented in ODL (http://github.com/odlgroup/odl), our Python based framework for reconstruction which enables one to utilize the existing integration between ODL and TensorFlow. The research is part of a larger effort that aims to combine elements of variational regularization with machine learning for solving large scale inverse problems, see the arXiv-reports http://arxiv.org/abs/1707.06474 and http://arxiv.org/abs/1704.04058 and the blog-post at http://adler-j.github.io/2017/07/21/Learning-to-reconstruct.html for further details. Part of the research may include industrial (Elekta and Philips Healthcare) and clinical (Karolinska University Hospital) collaboration.
Announcement & instructions:
http://www.kth.se/en/om/work-at-kth/lediga-jobb/what:job/jobID:158923/type:job/where:4/apply:1




Best regards,
Ozan


--

Assoc. Prof. Ozan Öktem
Director, KTH Life Science Technology Platform
Web: http://ww.kth.se/lifescience


Department of Mathematics
KTH Royal Institute of Technology
SE-100 44 Stockholm, Sweden
E-mail: ozan@kth.se




Learned Primal-dual Reconstruction by Jonas Adler, Ozan Öktem

We propose a Learned Primal-Dual algorithm for tomographic reconstruction. The algorithm includes the (possibly non-linear) forward operator in a deep neural network inspired by unrolled proximal primal-dual optimization methods, but where the proximal operators have been replaced with convolutional neural networks. The algorithm is trained end-to-end, working directly from raw measured data and does not depend on any initial reconstruction such as FBP.
We evaluate the algorithm on low dose CT reconstruction using both analytic and human phantoms against classical reconstruction given by FBP and TV regularized reconstruction as well as deep learning based post-processing of a FBP reconstruction.
For the analytic data we demonstrate PSNR improvements of >10 dB when compared to both TV reconstruction and learned post-processing. For the human phantom we demonstrate a 6.6 dB improvement compared to TV and a 2.2 dB improvement as compared to learned post-processing. The proposed algorithm also improves upon the compared algorithms with respect to the SSIM and the evaluation time is approximately 600 ms for a 512 x 512 pixel dataset.  

Solving ill-posed inverse problems using iterative deep neural networks by Jonas Adler, Ozan Öktem
We propose a partially learned approach for the solution of ill-posed inverse problems with not necessarily linear forward operators. The method builds on ideas from classical regularization theory and recent advances in deep learning to perform learning while making use of prior information about the inverse problem encoded in the forward operator, noise model and a regularizing functional. The method results in a gradient-like iterative scheme, where the "gradient" component is learned using a convolutional network that includes the gradients of the data discrepancy and regularizer as input in each iteration. We present results of such a partially learned gradient scheme on a non-linear tomographic inversion problem with simulated data from both the Shepp-Logan phantom as well as a head CT. The outcome is compared against FBP and TV reconstruction and the proposed method provides a 5.4 dB PSNR improvement over the TV reconstruction while being significantly faster, giving reconstructions of 512 x 512 volumes in about 0.4 seconds using a single GPU.
An implementation is here: https://github.com/adler-j/learned_gradient_tomography
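For readers who want a starting point for experimenting, here is a minimal sketch of the classical FBP baseline that both papers compare against, written with the authors' ODL framework; the geometry, noise level, and phantom choice are illustrative and not taken from the papers, and a tomography backend such as ASTRA or scikit-image must be installed for the ray transform:

    import odl

    # Reconstruction space and a simple parallel-beam CT setup.
    space = odl.uniform_discr([-20, -20], [20, 20], [512, 512])
    geometry = odl.tomo.parallel_beam_geometry(space)
    ray_trafo = odl.tomo.RayTransform(space, geometry)

    # Simulated noisy data from the (modified) Shepp-Logan phantom.
    phantom = odl.phantom.shepp_logan(space, modified=True)
    data = ray_trafo(phantom) + 0.5 * odl.phantom.white_noise(ray_trafo.range)

    # Filtered back-projection baseline; the learned schemes above replace this
    # single analytic step with trained, unrolled iterative updates.
    fbp = odl.tomo.fbp_op(ray_trafo)
    reco = fbp(data)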
Join the CompressiveSensing subreddit or the Google+ Community or the Facebook page and post there !
Liked this entry ? subscribe to Nuit Blanche's feed, there's more where that came from. You can also subscribe to Nuit Blanche by Email, explore the Big Picture in Compressive Sensing or the Matrix Factorization Jungle and join the conversations on compressive sensing, advanced matrix factorization and calibration issues on Linkedin.

          Densely Connected Convolutional Networks - implementations -        

Densely Connected Convolutional Networks by Gao Huang, Zhuang Liu, Kilian Q. Weinberger, Laurens van der Maaten

Recent work has shown that convolutional networks can be substantially deeper, more accurate, and efficient to train if they contain shorter connections between layers close to the input and those close to the output. In this paper, we embrace this observation and introduce the Dense Convolutional Network (DenseNet), which connects each layer to every other layer in a feed-forward fashion. Whereas traditional convolutional networks with L layers have L connections - one between each layer and its subsequent layer - our network has L(L+1)/2 direct connections. For each layer, the feature-maps of all preceding layers are used as inputs, and its own feature-maps are used as inputs into all subsequent layers. DenseNets have several compelling advantages: they alleviate the vanishing-gradient problem, strengthen feature propagation, encourage feature reuse, and substantially reduce the number of parameters. We evaluate our proposed architecture on four highly competitive object recognition benchmark tasks (CIFAR-10, CIFAR-100, SVHN, and ImageNet). DenseNets obtain significant improvements over the state-of-the-art on most of them, whilst requiring less memory and computation to achieve high performance. Code and models are available at this https URL .
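To make the connectivity pattern concrete, here is a rough Keras sketch of a single dense block (this is not the authors' code; the depth and growth rate are illustrative):

    from keras.models import Model
    from keras.layers import Input, Conv2D, BatchNormalization, Activation, Concatenate

    def dense_block(x, num_layers=4, growth_rate=12):
        # Each new layer sees the concatenation of all preceding feature maps,
        # which is what yields the L(L+1)/2 direct connections described above.
        for _ in range(num_layers):
            y = BatchNormalization()(x)
            y = Activation('relu')(y)
            y = Conv2D(growth_rate, (3, 3), padding='same')(y)
            x = Concatenate()([x, y])
        return x

    inp = Input(shape=(32, 32, 16))
    out = dense_block(inp)
    model = Model(inp, out)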


From the main implementation page at: https://github.com/liuzhuang13/DenseNet

"..Other Implementations
  1. Our Caffe Implementation
  2. Our (much more) space-efficient Caffe Implementation.
  3. PyTorch Implementation (with BC structure) by Andreas Veit.
  4. PyTorch Implementation (with BC structure) by Brandon Amos.
  5. MXNet Implementation by Nicatio.
  6. MXNet Implementation (supporting ImageNet) by Xiong Lin.
  7. Tensorflow Implementation by Yixuan Li.
  8. Tensorflow Implementation by Laurent Mazare.
  9. Tensorflow Implementation (with BC structure) by Illarion Khlestov.
  10. Lasagne Implementation by Jan Schlüter.
  11. Keras Implementation by tdeboissiere.
  12. Keras Implementation by Roberto de Moura Estevão Filho.
  13. Keras Implementation (with BC structure) by Somshubra Majumdar.
  14. Chainer Implementation by Toshinori Hanya.
  15. Chainer Implementation by Yasunori Kudo.
  16. Fully Convolutional DenseNets for segmentation by Simon Jegou...."



          Slides: Deep Learning and Reinforcement Learning Summer School 2017 @ MILA Montreal, Canada        
The Deep Learning and Reinforcement Learning Summer School 2017 just finished and here are some of the slides presented there (videos should be coming later) 




          Google AI could keep baby food safe        

Google's artificial intelligence technology can help the food industry beyond picking better cucumbers. In one company's case, it could prevent your child from getting sick. Japanese food producer Kewpie Corporation has revealed that it's using Google's TensorFlow to quickly inspect ingredients, including the diced potatoes it uses in baby food. The firm and its partner BrainPad trained the machine learning system to recognize good ingredients by feeding it 18,000 photos, and set it to work looking for visual 'anomalies' that hint at sub-par potatoes. The result was an inspection system with "near-perfect" accuracy, culling more defective ingredients than humans alone -- even with a conveyor belt shuttling potatoes along at high speed.

Source: Google


          IBM speeds deep learning by using multiple servers        

For everyone frustrated by how long it takes to train deep learning models, IBM has some good news: It has unveiled a way to automatically split deep-learning training jobs across multiple physical servers -- not just individual GPUs, but whole systems with their own separate sets of GPUs.

Now the bad news: It's available only in IBM's PowerAI 4.0 software package, which runs exclusively on IBM's own OpenPower hardware systems.

Distributed Deep Learning (DDL) doesn't require developers to learn an entirely new deep learning framework. It repackages several common frameworks for machine learning: TensorFlow, Torch, Caffe, Chainer, and Theano. Deep learning projects that use those frameworks can then run in parallel across multiple hardware nodes.
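DDL itself ships only with PowerAI, but the underlying idea is ordinary synchronous data parallelism. A toy, framework-free Python sketch of one such step follows; all names here are illustrative, grad_fn stands in for a real framework's backward pass, and in DDL the per-worker loop would actually run on separate servers:

    import numpy as np

    def sync_data_parallel_step(params, worker_batches, grad_fn, lr=0.01):
        # Each worker computes gradients on its own shard of the batch...
        grads = [grad_fn(params, batch) for batch in worker_batches]
        # ...the gradients are averaged (an all-reduce across nodes)...
        avg_grad = np.mean(grads, axis=0)
        # ...and every replica applies the same update, keeping them in sync.
        return params - lr * avg_grad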



          Google Announces Tensor2Tensor for TensorFlow        

The Google Brain team open-sourced Tensor2Tensor, a set of utilities and wrappers for modularizing TensorFlow workflow components to create a more portable and repeatable environment for TensorFlow-based deep neural network programs.

By Dylan Raithel
          RR 319 Machine Learning with Tyler Renelle        


This episode of the Ruby Rogues Panel features panelists Charles Max Wood and Dave Kimura. Tyler Renelle, who stops by to talk about machine learning, joins them as a guest. Tyler is the first guest to talk on Adventures in Angular, JavaScript Jabber, and Ruby Rogues. Tune in to find out more about Tyler and machine learning!

What is machine learning?

Machine learning is a different concept than programmers are used to.

There are three phases in computing technology.

  • First phase – building computers in the first place, where behavior was hard-coded into the physical computing machinery
  • Second phase – programmable computers, where you can reprogram your computer to do anything. This is the phase most programmers fall under.
  • Third phase – machine learning falls under this phase.

Machine learning is where the computer programs itself to do something. You give the computer a measurement of how it’s doing based on data and it trains itself and learns how to do the task. It is beginning to get a lot of press and become more popular. This is because it is becoming a lot more capable by way of deep learning.
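A toy illustration of that "give the computer a measurement of how it's doing and let it adjust itself" idea, as plain-Python gradient descent on a one-parameter model (everything here is made up for illustration):

    # Learn y = w * x from examples by repeatedly nudging w to reduce the error.
    data = [(1.0, 2.0), (2.0, 4.0), (3.0, 6.0)]  # toy (x, y) pairs; the true w is 2
    w, lr = 0.0, 0.05

    for step in range(200):
        # Mean squared error is the "measurement of how it's doing".
        grad = sum(2 * (w * x - y) * x for x, y in data) / len(data)
        w -= lr * grad  # the program adjusts itself based on that measurement

    print(round(w, 3))  # converges to roughly 2.0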

AI – Artificial Intelligence

Machine learning is a sub field of artificial intelligence. AI is an overarching field of the computer simulating intelligence. Machine learning has become less and less a sub field over time and more a majority of AI. Now we can apply machine learning to vision, speech processing, planning, knowledge representation. This is fast taking over AI. People are beginning to consider the terms artificial intelligence and machine learning synonymous.

Self-driving cars are a type of artificial intelligence. The connection between machine learning and self-driving cars may seem abstract, but machine learning is fundamental to how they work: rather than hand-coding every behavior, you program the car to learn from and correct its mistakes. Another example is facial recognition. The program learns a person's face over time so it can make an educated guess as to whether the person is who they say they are. Because the approach is statistical, your face can be off by a hair or a hat; small variations won't throw it off.

How do we start solving the problems we want to be solved?

Machine learning has been applied since the 1950s to a broad spectrum of problems. You have to have a little bit of domain knowledge and do some research.

Machine Learning Vs Programming

Machine learning fits any sort of fuzzy programming situation, whereas traditional programming is when you specify behavior explicitly and statically.

Why should you care to do machine learning?

People should care because this is the next wave of computing. There is a theory that it will displace jobs: self-driving cars will displace truck drivers, Uber drivers, and taxis. There are things like logo generators already, and machines are generating music, poetry, and website designs. We shouldn't be afraid, but we should keep an eye on it.

If a robot or computer program or AI were able to write its own code, at what point would it be able to overwrite or basically nullify the three laws of robotics?

Nick Bostrom wrote the book Superintelligence, which had many big names in technology talking about the dangers of AI. Artificial intelligence has been talked about widely because of the possibility of evil killer robots in the Sci-Fi community. There are people who hold very potential concerns, such as job automation.

Consciousness is a huge topic of debate right now on this topic. Is it an emergent property of the human brain? Is what we have with deep learning enough of a representation to achieve consciousness? It is suggested that AI may or may not achieve consciousness. The question is if it is able to achieve consciousness - will we be able to tell there isn’t a person there?

If people want to dive into this where do they go?

Machine Learning Language

The main language used for machine learning is Python. This is not because of the language itself, but because of the tools built on top of it. The main framework is TensorFlow. Python code in TensorFlow drops down to C and executes on the GPU for the matrix algebra that is essential for deep learning. You can always use C, C++, Java, and R. Data scientists mostly use R, while researchers use C and C++ so they can custom-code their matrix algebra themselves.
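As a rough illustration of that point (a sketch of mine, not code from the episode, using the TF 1.x-style API), the Python below only describes the computation; the heavy matrix algebra runs in TensorFlow's C++/GPU runtime when the session executes the graph:

    import tensorflow as tf

    a = tf.random_normal([1000, 1000])   # graph nodes; nothing is computed yet
    b = tf.random_normal([1000, 1000])
    product = tf.matmul(a, b)            # the expensive op, executed off-Python

    with tf.Session() as sess:
        result = sess.run(product)       # here the C/GPU code actually runs
        print(result.shape)              # (1000, 1000)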

Picks

Dave:

Charles:

Tyler:


          TensorFlow working on an Ubuntu virtual Machine        
  This is a quick step-by-step on how to get TensorFlow working on an Ubuntu virtual machine. The initial steps are very high level because it is not hard to find tutorials or documentation that supports them.
  1) Download and install Oracle VirtualBox from https://www.virtualbox.org/wiki/Downloads
  2) Download an Ubuntu image from http://www.ubuntu.com/download
  3) Create […]
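The post is cut off above; as a hedged sketch of the remaining steps (the original may differ), TensorFlow can typically be installed inside the Ubuntu VM with pip and then verified with a short Python session:

    # Hypothetical verification script; assumes TensorFlow was installed in the
    # VM beforehand, e.g. with: pip install tensorflow
    import tensorflow as tf

    hello = tf.constant("Hello from the Ubuntu VM")
    a = tf.constant(2)
    b = tf.constant(3)

    with tf.Session() as sess:           # TF 1.x-style session API
        print(sess.run(hello))
        print(sess.run(a + b))           # should print 5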
          What does Google’s tensorflow mean for AI?        


Google’s release of their tensorflow machine learning library has attracted a lot of attention recently.   Like everyone else in the field I’ve felt moved to take a look.

(Microsoft's recent release of an open source distributed machine learning toolkit is also interesting.   But that would be another story; here I'll restrict myself to tensorflow...)

tensorflow as a Deep Machine Learning Toolkit


Folks familiar with tools for deep learning based machine vision will quickly see that the tensorflow neural net library is fairly similar in concept to the Theano/pylearn2 library from Yoshua Bengio’s team at U. Montreal.   Its functionality is similar to Theano/pylearn2 and also to other modern deep ML toolkits like Caffe.   However, it looks like it may combine the strengths of the different existing toolkits in a novel way — an elegant, simple-to-use architecture like Theano/pylearn2, combined with rapid execution like one gets with Caffe.

Tensorflow is an infrastructure and toolkit, intended so that one can build and run specific deep learning algorithms within it.  The specific algorithms released with the toolkit initially are well-known and fairly limited.   For instance, they give a 2D convolutional neural net but not a 3D one (though Facebook open-sourced a 3D CNN not long ago).

The currently released version of tensorflow runs on one machine only (though making efficient use of multiple processors).  But it seems they may release a distributed version some time fairly soon.

tensorflow as a Dataflow Framework


As well as a toolkit for implementing distributed deep learning algorithms, tensorflow is also — underneath — a fairly general framework for “dataflow”, for passing knowledge around among graphs.   However, looked at as a dataflow architecture it has some fairly strict limitations, which emerge directly from its purpose as an infrastructure for current deep learning neural net algorithms.

For one thing, tensorflow seems optimized for passing around pretty large chunks of data ....  So if one wanted to use it to spread activation around in a network, one wouldn't make an Operation per neuron, rather one would make an "activation-spreading" Operation and have it act on a connection matrix or similar....
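A minimal sketch of what that might look like (my illustration, written against the later TF 1.x API rather than the 2015 release discussed here): a single matrix-multiply Operation spreads activation across every connection at once, instead of one Operation per neuron.

    import numpy as np
    import tensorflow as tf

    n = 5                                                                # number of nodes
    weights = tf.constant(np.random.rand(n, n).astype(np.float32))      # connection matrix
    activation = tf.Variable(np.random.rand(n, 1).astype(np.float32))   # current activations

    # One graph Operation spreads activation over all connections at once.
    spread = tf.nn.relu(tf.matmul(weights, activation))
    update = tf.assign(activation, spread / tf.reduce_max(spread))      # normalize and store

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        for _ in range(10):                                  # ten synchronous spreading steps
            print(sess.run(update).ravel())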

Furthermore, tensorflow’s execution model seems to be fundamentally *synchronous*.  Even when run across multiple machines in distributed mode using Senders and Receivers, the basic mathematical operation of the network is synchronous.  This is fine for most current
deep learning algorithms, which are constructed of nodes that are assumed to pass information around among each other in a specific and synchronized way.  The control mechanisms tensorflow provides (e.g. for and while constructs) are flowchart-like rather than adaptive-network-like, and remain within the synchronized execution paradigm, so far as I can tell.

This is a marked contrast to ROS, which my team at OpenCog and Hanson Robotics is currently using for robotics work — in ROS one wraps up different functions in ROS nodes, which interact with each other autonomously and asynchronously.  It’s also a contrast to the BrICA framework for AGI and brain emulation produced recently by the Japanese Whole Brain Initiative.   BriCA’s nodes pass around vectors rather than tensors, but since a tensor is basically a multidimensional stack of vectors, this amounts to the same thing.  BrICA’s nodes interact asynchronously via a simple but elegant mechanism.   This reflects the fact that BrICA was engineered as a framework for neural net based AGI, whereas tensorflow was engineered as a framework for a valuable but relatively narrow class of deep learning based data processing algorithms.

That is: Conceptually, it seems that tensorflow is made for executing precisely-orchestrated multi-node algorithms (potentially in a distributed way), in which interaction among nodes happens in a specifically synchronized and predetermined way based on a particular architecture; whereas BriCA can also be applied to more open-ended designs in which different nodes (components) react to each others' outputs on the fly and everything does not happen within an overall architecture in which the dynamic relations between the behaviors of the components are thought out.  Philosophically this relates to the more "open-ended" nature of AGI systems.

 tensorflow and OpenCog?


My current view on the currently popular deep learning architectures for data processing (whose implementation and tweaking and application tensorflow is intended to ease) is that they are strong for perceptual pattern recognition, but do not constitute general-purpose cognitive architectures for general intelligence.

Contrasting tensorflow and OpenCog (which is worse by far than contrasting apples and oranges, but so be it…), one observation we can make is that an OpenCog Atom is a persistent store of information, whereas a TensorFlow graph is a collection of Operations (each translating input into output).  So, on the face of it, TensorFlow is best for (certain sorts of) procedural knowledge, whereas Atomspace is best for declarative knowledge....   It seems the "declarative knowledge" in a TensorFlow graph is pretty much contained in the numerical tensors that the Operations pass around...

In OpenCog’s MOSES component, small LISP-like programs called “Combo trees” are used to represent certain sorts of procedural knowledge; these are then mapped into the Atomspace for declarative analysis.  But deep learning neural nets are most suitable for representing different sorts of procedural knowledge than Combo trees — e.g. procedural knowledge used for low-level perception and action.  (The distinction between procedural and sensorimotor knowledge blurs a bit here, but that would be a topic for another blog post….)

I had been thinking about integrating deep learning based perception into OpenCog using Theano / pylearn2 as an underlying engine — making OpenCog Atoms that executed small neural networks on GPU, and using the OpenCog Atomspace to glue together these small neural networks (via the Atoms that refer to them) into an overall architecture.  See particulars here and here.

Now I am wondering whether we should do this using tensorflow instead, or as well….

In terms of OpenCog/tensorflow integration, the most straightforward thing would be to implement


  • TensorNode ... with subtypes as appropriate
  • GroundedSchemaNodes that wrap up TensorFlow "Operations"


This would allow us to basically embed TensorFlow graphs inside the Atomspace...

Deep learning operations like convolution are represented as opaque operations in tensorflow, and would also be opaque operations (wrapped inside GSNs) in OpenCog....

The purported advantage over Theano would be that TensorFlow is supposed to be faster (we'll test), whereas Theano has an elegant interface but is slower than Caffe ...

Wrapping Operations inside GSN would add a level of indirection/inefficiency, but if the Operations are expensive things like running convolutions on images or multiplying big matrices, this doesn't matter much...

Anyway, we will evaluate and see what makes sense! …

Rambling Reflections on the Open-Source Ecosystem


The AI / proto-AGI landscape is certainly becoming interesting and complex these days.  It seems that AI went in just a few years from being obscure and marginalized (outside of science fiction) to being big-time corporate.  Which is exciting in terms of the R&D progress it will likely lead to, yet frustrating to those of us who aren’t thrilled with the domination of the world socioeconomy by megacorporations.

But then we also see a major trend of big companies sharing significant aspects of their AI code with the world at large via open-source releases like Facebook’s conv3D code and Google’s tensorflow, and so many others.   They are doing this for multiple reasons — one is that it keeps their research staff happy (most researchers want to feel they’re contributing to the scientific community at large rather than just to one company); and another is that other researchers, learning from and improving on the code they have released, will create new innovations they can use.  The interplay between the free-and-open R&D world and the corporate-and-proprietary R&D world becomes subtler and subtler.

Supposing we integrate tensorflow into OpenCog and it yields interesting results… Google could then choose to use OpenCog themselves and integrate it into their own systems.  Hopefully if they did so, they would push some of their OpenCog improvements into the open-source ecosystem as well.  Precisely where this sort of thing will lead business-wise is not entirely clear, given the shifting nature of current tech business models, but it’s already clear that companies like Google don’t derive the bulk of their business advantage from proprietary algorithms or code, but rather from the social dynamics associated with their products and their brand.

If open-source AI code were somehow coupled with a shift in the dynamics of online interaction, to something more peer-to-peer and less big-media and big-company and advertising dominated — THEN we would have a more dramatic shift, with interesting implications for everybody’s business model.  But that’s another topic that would lead us far afield from tensorflow.  For the time being, it seems that the open-source ecosystem is playing a fairly core role in the complex unfolding of AI algorithms, architectures and applications among various intellectual/socieconomic actors … and funky stuff like tensorflow is emerging as a result.


          â€œì¸ê³µì§€ëŠ¥ 음악, 타이밍·폭발력 지녀”        
구글은 지난해 ‘마젠타 프로젝트(magenta.tensorflow.org)’를 시작했다. 인공지능을 통해 예술 창작 학습 알고리듬을 설계하는 프로그램이다. 그리고 지금 클래식 음악을 작곡하고, 새로운 이미지를 창조하고 있는 중이다. 마젠타 프로젝트를 이끌고 있는 더글러스 에크(Douglas Eck)는 9일 ‘사이언스’ 지와의 인터뷰를 통해 이 예술작품 창작이 가능한 인공지능의 과거·현재·미래에 대해 언급했다. ‘사이언스’ 지는 이 인터뷰가 간결·명료하게 진행됐다고 밝혔다.
          è°·æ­Œå›žåº”重返中国猜测:目前没计划        
 æžé€Ÿä½“育12月8日消息,2016年谷歌中国开发者大会在北京国家会议中心举行,谷歌大中华区总裁石博盟(Scott Beaumont)、谷歌全球开发者产品总监Ben等均出席了大会。 本次大会面向中国开发者,主旨是帮助中国人开发的应用走出国门,谷歌在会上向中国开发者介绍了PWA、Angular、TensorFlow等一系列平台和工具。此外,谷歌这次还开通了简体中文版.cn域名的Google Developers网站(developers.google.cn)。网站内容包含Android开发所需的SDK、Android Studio和一些谷歌服务API,此外谷歌还开设了颇具
          [chatbot][ml] Chatbots and beam search        

When generating output from the decoder in seq2seq, beam search should improve results because it can pick candidates with higher probability. For the details of beam search, see karino2's write-up. Beam search itself is a simple, easy-to-understand algorithm, but writing it as a TensorFlow graph is very hard. The seq2seq library I am using has a loop_function argument that lets you intercept the previous input and do various things with it, so I spent a long time reading the code, but it was still difficult. While trial-and-erroring I found someone else's implementation and finally got it working. I am not sure whether beam search actually produces better results.
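The post does not include its implementation, so here is a minimal, framework-free sketch of the algorithm itself; next_token_probs is a hypothetical stand-in for the decoder step and returns {token: probability} given the tokens decoded so far:

    import math

    def beam_search(next_token_probs, beam_width=3, max_len=20, eos="<eos>"):
        beams = [([], 0.0)]                       # (tokens, log-probability)
        for _ in range(max_len):
            candidates = []
            for tokens, score in beams:
                if tokens and tokens[-1] == eos:  # finished beams carry over unchanged
                    candidates.append((tokens, score))
                    continue
                for tok, p in next_token_probs(tokens).items():
                    candidates.append((tokens + [tok], score + math.log(p)))
            # keep only the beam_width highest-scoring partial sequences
            beams = sorted(candidates, key=lambda c: c[1], reverse=True)[:beam_width]
        return beams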

>おはよう
normal:おはようございます
beam
0 おはよう ござい ます
1 お は あり
2 お は あり です 〜 ♪

>こんにちは
normal:はい(˘Ï‰˘)
beam
0 はい ( ˘ ω ˘ )
1 はい ( ˘ ω ˘ ) スヤァ
2 はい ( ˙ㅿ˙ 。 .
3 はい ♡
4 はい ( ´ ω 。
5 はい 、 さ www
6 はい ( 笑

>ばいばいー
normal:わろきちってんじゃんwww
beam
0 がち やし ま ー ん
1 いや ま ー !
2 わろ ぶ や !
3 ほら
4 ネタ やし ぶ
5 ど ま ー
6 がち やし ま ーー
7 いつの間に ま ー
8 す
9 いつの間に ぶ
10 いつの間に やし ぶ うち
11 やらかし た ❤
12 現実 やし
13 ほんま やし ぶ ()
14 や ま ー

>(月曜日から)逃げちゃ駄目だ……!
normal:えぇこれは、、、
beam
0 なんで 進捗 は これ じゃ ねぇ ・ ・ ω !
1 え ぇ これ は 光 は 、 !
2 え ぇ これ は 嫌 )
3 なんで 進捗 おっ け ( ω !
4 なんで 進捗 は これ じゃ ぞ 〜

> 子供たちにつられて苦手なミニオンズ…(´・ω・`)w
normal:気をしてねー(˘Ï‰˘)
beam
0 気 を し て ( ˘ つ ω -(´∀`; )
1 気 を すん な ( ˙ ˘ )
2 仕事 を すん や ( ˙ ω -(´∀`; )
3 気 を し て ねー 。 ( ^ ー ` ・ )
4 気 を し てる やろ ( ˙ ˘ ω ˘ ・) !
5 気 を し てる やろ ( ˙ ˘ ω ˘ ω ・ )
6 気 を し てる の だ よ )

> 中華そば醤油をいただきました💕お、おいしい〜😍大盛いけたかも?
normal: 追加ですよねwww
beam
0 追加 し まし た ☺
1 追加 です よ ☺ 
2 追加 です よ ね www

          [ML] Style Transfer        

I implemented Style Transfer as an assignment for Stanford's machine learning class CS 20SI. Style Transfer takes the artistic style of a painting, extracted as features from a photo of it, and applies that style to another image. What is interesting is that the image is generated by minimizing the sum of the style loss and the content loss. On the other hand, things like the layer structure feel like craftsmanship, and I am not confident I could build it from scratch myself. The assignment code is around here.
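For reference, the objective being minimized is the standard neural style transfer loss (the textbook formulation, not quoted from the assignment): a weighted sum of a content term and a style term over the generated image.

    % x: generated image, c: content image, s: style image;
    % alpha and beta weight the two terms.
    L_{\mathrm{total}}(x) = \alpha \, L_{\mathrm{content}}(x, c) + \beta \, L_{\mathrm{style}}(x, s)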

Example 1

[three images from the original post]

Example 2

[three images from the original post]


          TensorFlow        

"TensorFlow is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) that flow between them. This flexible architecture lets you deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile device without rewriting code."

Quote from: https://github.com/tensorflow/tensorflow


          Week 3 of July: Technology & Industry HOT 7 News        

For the third week of July, here are seven technology and startup stories that set global media abuzz.


1. Chinese government announces $150 billion investment to lead the AI industry by 2030

The Chinese government has begun intervening in the artificial intelligence market in earnest. According to a government announcement on the 22nd, China will pursue staged investment in AI technology development and commercialization in order to lead the global AI industry by 2030.



According to the State Council, China plans to catch up with the AI technology of advanced countries by 2020, take the global lead in several fields by 2025, and become the AI powerhouse that surpasses the United States by 2030, laying out concrete measures for each stage. The plan spells out cooperation not only at the government level but also between universities and companies, announcing expanded investment plans from companies already leading the AI field, such as Baidu and Alibaba, and preparations for government-led projects. It also presents a roadmap for industry nationwide, with AI technology positioned as a tool to stimulate economic growth. Because China is already becoming a powerhouse in several areas of AI, this sweeping declaration of support from the Chinese government is drawing worldwide attention.

 

2. Bitcoin system upgrade ends the possibility of a Bitcoin split

With the announcement of improvement proposal BIP91, the possibility of the Bitcoin blockchain splitting has disappeared, and the price of Bitcoin has regained its stability.


The Bitcoin community recently proposed a platform called SegWit2X, but because users and developers could not reach agreement on it, there were concerns that Bitcoin would split. SegWit2X separates the witness data from transaction records so that more transactions can fit in a block, increasing the capacity of the currently overloaded Bitcoin blockchain. If the SegWit2X platform were not accepted, the blockchain could be split and users forced to pick a platform, which could drive down the value of Bitcoin.



The earlier proposal required at least 95% support for SegWit2X, but BIP91 lowers the threshold substantially to 80%. As a result, the possibility of a Bitcoin split has disappeared and adoption of SegWit2X is all but certain. Once SegWit2X is adopted, Bitcoin can be transferred more easily and quickly on top of the blockchain's existing transaction capabilities, which could further energize the Bitcoin market.


3. The war between Alibaba and Tencent playing out in Southeast Asia

According to the Google-Temasek report, Southeast Asia had 260 million internet users in 2016, with roughly 3.8 million new users coming online every month. Many giant technology companies are targeting Southeast Asia, where more than 480 million people are expected to be online by 2020. Among them, China's two giants, Alibaba and Tencent, are investing the most aggressively, pouring capital into the many startups emerging in the region.


Alibaba has twice invested $1 billion, in April of last year and June of this year, in Lazada, often called the Amazon of Southeast Asia, greatly helping Lazada expand its business. With Amazon expected to enter Southeast Asia in the second quarter of this year, these bold strategic investments can be seen as a wall built to keep Amazon out. Backed by Alibaba, Lazada has expanded into six countries and acquired the grocery company Redmart, moving quickly into the grocery market much as Amazon has. Alibaba is also deeply involved in the growth of Asia's fintech industry through Ant Financial (formerly Alipay); much as it invested $200 million in Korea's Kakao Pay, it has also invested in Southeast Asian fintech startups such as Ascend Money, Mynt, Emtek, and M-Daq.



Tencent, meanwhile, is focusing more on Southeast Asia's media companies: it has been making long-term investments in the Thailand-based media company Sanook, and it launched the music streaming service Joox in Southeast Asia to hold off Spotify's advance.


With both companies eyeing the same market, an investment war between them seems unavoidable. For example, amid predictions that Alibaba will join the new funding round of Grab, one of Southeast Asia's top unicorns, Tencent backed Grab's rival Go-Jek with $1.2 billion in May. Alibaba is also reportedly pursuing an investment involving JD.com, an e-commerce company backed by Tencent.


American IT companies such as Google, Facebook and Amazon are eyeing the Southeast Asian market as well, but they are mostly focused on launching their own services there, which makes the roles of Alibaba and Tencent look that much more important in the region. Whether as a strategic gateway for outside companies from the US and Europe into Asia or as a huge emerging market in its own right, the Southeast Asian market will be worth watching even more closely from here on.


4. Alexa finally comes to the Amazon app

Has Amazon finally finished vetting Alexa? Starting this week, some users can use Alexa directly inside Amazon's Android app, with the rollout reportedly starting with users who have purchased an Amazon Echo. Alexa previously had its own standalone Android app; by integrating Alexa into the main Amazon app, Amazon is expected to start promoting Alexa's usefulness to its massive user base in earnest. Alexa features available in the Amazon app are said to include not only the basics but also product search, order tracking, weather and traffic checks, music playback, and smart home control.




5. IPU chip startup Graphcore secures $30 million in Series B funding

Graphcore, an AI startup emerging as a force in next-generation neural networks, is riding high after securing $30 million in Series B funding. Investors include Demis Hassabis, co-founder of DeepMind, which drew worldwide attention with its AlphaGo matches; Uber technology advisor Zoubin Ghahramani; OpenAI co-founder Greg Brockman; and world-renowned deep learning researcher Pieter Abbeel. Existing investors such as Samsung Catalyst Fund and Bosch Venture Capital also joined the round.




Graphcore is developing the IPU (Intelligence Processing Unit), a processor designed to accelerate machine learning and AI workloads. It is expected to deliver far more efficient computation than the GPUs widely used for AI training today. The company is also building software compatible with popular machine learning frameworks such as TensorFlow, Caffe2 and MXNet, and is expected to offer a solution that can unify AI training frameworks in the future.


6. Mira builds Prism, a $99 AR headset to take on HoloLens

To challenge Microsoft's HoloLens, US startup Mira has unveiled Prism, an augmented reality headset that costs only $99. Like the Samsung Gear VR, the headset works by slotting in an iPhone, and it was developed using Apple's ARKit. Where the Gear VR mounts the phone facing the front of the headset, Prism inserts it facing the other way to create an augmented reality view.



Using the iPhone's gyro sensor and display, Prism creates its AR environment through a fisheye lens, which reportedly widens the angle over which content can be displayed. With Google's Daydream View and Samsung's Gear VR entering the smartphone-based headset market and HoloLens launching at a very high price, Mira's Prism could be quite competitive if suitable game content and an SDK follow.


7. Ride-sharing platform Lyft sets up a self-driving car team

Lyft, which has led the US ride-sharing ecosystem alongside Uber, has jumped into the self-driving car industry. On Thursday, Lyft executives announced that the company will hire hundreds of engineers to build its own self-driving software and hardware. Lyft had previously formed partnerships for self-driving development with Google subsidiary Waymo, nuTonomy, GM and others, and its move into the field looks likely to strain those relationships.



Lyft's executives have long argued that, as Uber concluded before them, developing autonomous vehicles is essential to making ride-sharing price-competitive. Experts also believe that Lyft's hands-on role in building vehicles could give it an edge over companies developing only self-driving software.





          How machine learning in G Suite makes people more productive        

Email management, formatting documents, creating expense reports. These are just some of the time-sinks that can affect your productivity at work. At Google, this is referred to as “overhead”—time spent working on tasks that do not directly relate to creative output—and it happens a lot.

According to a Google study in 2015, the average worker spends only about 5 percent of his or her time actually coming up with the next big idea. The rest of our time is caught in the quicksand of formatting, tracking, analysis or other mundane tasks. That’s where machine learning can help.

Machine learning algorithms observe examples and make predictions based on data. In G Suite, machine learning models make your workday more efficient by taking over menial tasks, like scheduling meetings, or by predicting information you might need and surfacing it for you, like suggesting Docs.

Time spent chart

Source: Google Data, April 2015

Eliminating spam within Gmail using machine learning

One of the earliest machine learning use cases for G Suite was within Gmail. Historically, Gmail used a rule-based system, which meant our anti-spam team would create new rules to match individual spam patterns. Over a decade of using this process, we improved spam detection accuracy to 99 percent.

Starting in 2014, our team augmented this rule-based system to generate rules using machine learning algorithms instead, taking spam detection one step further. Now, we use TensorFlow and other machine learning to continually regenerate the “spam filter,” so the system has learned to predict which emails are most likely junk. Machine learning finds new patterns and adapts far quicker than previous manual systems—it’s a big part of the reason that more than one billion Gmail users avoid spam within their account.

See machine learning in your favorite G Suite apps

G Suite’s goal is to help teams accomplish more with its intelligent apps, no matter where they are in the world. And chances are, you’ve already seen machine learning integrated into your day-to-day work to do just that.

Smart Reply, for example, uses machine learning to generate three natural language responses to an email. So if you find yourself on the road or pressed for time and in need of a quick way to clear your inbox, let Smart Reply do it for you.
Smart Reply GIF

Explore in Docs, Slides and Sheets uses machine learning to eliminate time spent on mundane tasks, like tracking down documents or information on the web, reformatting presentations or performing calculations within spreadsheets.

Explore

Quick Access in Drive predicts and suggests files you might need within Drive. Using machine intelligence, Quick Access can predict files based on who you share files with frequently, when relevant meetings occur within your Calendar or if you tend to use files at certain times of the day.

Quick Access

To learn more about how machine intelligence can make your life easier, sign up for this free webinar on June 15, 2017, featuring experts from MIT Research, Google and other companies. You can also check out the Big Data and Machine Learning blog or watch this video from Google Cloud Next with Ryan Tabone, director of product management at Google, where he explains more about “overhead.”


          100 announcements (!) from Google Cloud Next '17        

San Francisco — What a week! Google Cloud Next ‘17 has come to the end, but really, it’s just the beginning. We welcomed 10,000+ attendees including customers, partners, developers, IT leaders, engineers, press, analysts, cloud enthusiasts (and skeptics). Together we engaged in 3 days of keynotes, 200+ sessions, and 4 invitation-only summits. Hard to believe this was our first show as all of Google Cloud with GCP, G Suite, Chrome, Maps and Education. Thank you to all who were here with us in San Francisco this week, and we hope to see you next year.

If you’re a fan of video highlights, we’ve got you covered. Check out our Day 1 keynote (in less than 4 minutes) and Day 2 keynote (in under 5!).

One of the common refrains from customers and partners throughout the conference was “Wow, you’ve been busy. I can’t believe how many announcements you’ve had at Next!” So we decided to count all the announcements from across Google Cloud and in fact we had 100 (!) announcements this week.

For the list lovers amongst you, we’ve compiled a handy-dandy run-down of our announcements from the past few days:


Google Cloud is excited to welcome two new acquisitions to the Google Cloud family this week, Kaggle and AppBridge.

1. Kaggle - Kaggle is one of the world's largest communities of data scientists and machine learning enthusiasts. Kaggle and Google Cloud will continue to support machine learning training and deployment services in addition to offering the community the ability to store and query large datasets.

2. AppBridge - Google Cloud acquired Vancouver-based AppBridge this week, which helps you migrate data from on-prem file servers into G Suite and Google Drive.


Google Cloud brings a suite of new security features to Google Cloud Platform and G Suite designed to help safeguard your company’s assets and prevent disruption to your business: 

3. Identity-Aware Proxy (IAP) for Google Cloud Platform (Beta) - Identity-Aware Proxy lets you provide access to applications based on risk, rather than using a VPN. It provides secure application access from anywhere, restricts access by user, identity and group, deploys with integrated phishing resistant Security Key and is easier to setup than end-user VPN.

4. Data Loss Prevention (DLP) for Google Cloud Platform (Beta) - Data Loss Prevention API lets you scan data for 40+ sensitive data types, and is used as part of DLP in Gmail and Drive. You can find and redact sensitive data stored in GCP, invigorate old applications with new sensitive data sensing “smarts” and use predefined detectors as well as customize your own.

5. Key Management Service (KMS) for Google Cloud Platform (GA) - Key Management Service allows you to generate, use, rotate, and destroy symmetric encryption keys for use in the cloud.

6. Security Key Enforcement (SKE) for Google Cloud Platform (GA) - Security Key Enforcement allows you to require security keys be used as the 2-Step verification factor for enhanced anti-phishing security whenever a GCP application is accessed.

7. Vault for Google Drive (GA) - Google Vault is the eDiscovery and archiving solution for G Suite. Vault enables admins to easily manage their G Suite data lifecycle and search, preview and export the G Suite data in their domain. Vault for Drive enables full support for Google Drive content, including Team Drive files.

8. Google-designed security chip, Titan - Google uses Titan to establish hardware root of trust, allowing us to securely identify and authenticate legitimate access at the hardware level. Titan includes a hardware random number generator, performs cryptographic operations in the isolated memory, and has a dedicated secure processor (on-chip).


New GCP data analytics products and services help organizations solve business problems with data, rather than spending time and resources building, integrating and managing the underlying infrastructure:

9. BigQuery Data Transfer Service (Private Beta) - BigQuery Data Transfer Service makes it easy for users to quickly get value from all their Google-managed advertising datasets. With just a few clicks, marketing analysts can schedule data imports from Google Adwords, DoubleClick Campaign Manager, DoubleClick for Publishers and YouTube Content and Channel Owner reports.

10. Cloud Dataprep (Private Beta) - Cloud Dataprep is a new managed data service, built in collaboration with Trifacta, that makes it faster and easier for BigQuery end-users to visually explore and prepare data for analysis without the need for dedicated data engineer resources.

11. New Commercial Datasets - Businesses often look for datasets (public or commercial) outside their organizational boundaries. Commercial datasets offered include financial market data from Xignite, residential real-estate valuations (historical and projected) from HouseCanary, predictions for when a house will go on sale from Remine, historical weather data from AccuWeather, and news archives from Dow Jones, all immediately ready for use in BigQuery (with more to come as new partners join the program).

12. Python for Google Cloud Dataflow in GA - Cloud Dataflow is a fully managed data processing service supporting both batch and stream execution of pipelines. Until recently, these benefits have been available solely to Java developers. Now there’s a Python SDK for Cloud Dataflow in GA.

13. Stackdriver Monitoring for Cloud Dataflow (Beta) - We’ve integrated Cloud Dataflow with Stackdriver Monitoring so that you can access and analyze Cloud Dataflow job metrics and create alerts for specific Dataflow job conditions.

14. Google Cloud Datalab in GA - This interactive data science workflow tool makes it easy to do iterative model and data analysis in a Jupyter notebook-based environment using standard SQL, Python and shell commands.

15. Cloud Dataproc updates - Our fully managed service for running Apache Spark, Flink and Hadoop pipelines adds new beta support for restarting failed jobs (including automatic restart as needed), the ability to create single-node clusters for lightweight sandbox development (beta), GPU support, and the cloud labels feature for more flexible management of your Dataproc resources, now GA.


New GCP databases and database features round out a platform on which developers can build great applications across a spectrum of use cases:

16. Cloud SQL for Postgre SQL (Beta) - Cloud SQL for PostgreSQL implements the same design principles currently reflected in Cloud SQL for MySQL, namely, the ability to securely store and connect to your relational data via open standards.

17. Microsoft SQL Server Enterprise (GA) - Available on Google Compute Engine, plus support for Windows Server Failover Clustering (WSFC) and SQL Server AlwaysOn Availability (GA).

18. Cloud SQL for MySQL improvements - Increased performance for demanding workloads via 32-core instances with up to 208GB of RAM, and central management of resources via Identity and Access Management (IAM) controls.

19. Cloud Spanner - Launched a month ago, but still, it would be remiss not to mention it because, hello, it’s Cloud Spanner! The industry’s first horizontally scalable, globally consistent, relational database service.

20. SSD persistent-disk performance improvements - SSD persistent disks now have increased throughput and IOPS performance, which are particularly beneficial for database and analytics workloads. Read these docs for complete details about persistent-disk performance.

21. Federated query on Cloud Bigtable - We’ve extended BigQuery’s reach to query data inside Cloud Bigtable, the NoSQL database service for massive analytic or operational workloads that require low latency and high throughput (particularly common in Financial Services and IoT use cases).


New GCP Cloud Machine Learning services bolster our efforts to make machine learning accessible to organizations of all sizes and sophistication:

22.  Cloud Machine Learning Engine (GA) - Cloud ML Engine, now generally available, is for organizations that want to train and deploy their own models into production in the cloud.

23. Cloud Video Intelligence API (Private Beta) - A first of its kind, Cloud Video Intelligence API lets developers easily search and discover video content by providing information about entities (nouns such as “dog,” “flower”, or “human” or verbs such as “run,” “swim,” or “fly”) inside video content.

24. Cloud Vision API (GA) - Cloud Vision API reaches GA and offers new capabilities for enterprises and partners to classify a more diverse set of images. The API can now recognize millions of entities from Google’s Knowledge Graph and offers enhanced OCR capabilities that can extract text from scans of text-heavy documents such as legal contracts or research papers or books.

25. Machine learning Advanced Solution Lab (ASL) - ASL provides dedicated facilities for our customers to directly collaborate with Google’s machine-learning experts to apply ML to their most pressing challenges.

26. Cloud Jobs API - A powerful aid to job search and discovery, Cloud Jobs API now has new features such as Commute Search, which will return relevant jobs based on desired commute time and preferred mode of transportation.

27. Machine Learning Startup Competition - We announced a Machine Learning Startup Competition in collaboration with venture capital firms Data Collective and Emergence Capital, and with additional support from a16z, Greylock Partners, GV, Kleiner Perkins Caufield & Byers and Sequoia Capital.


New GCP pricing continues our intention to create customer-friendly pricing that’s as smart as our products; and support services that are geared towards meeting our customers where they are:

28. Compute Engine price cuts - Continuing our history of pricing leadership, we’ve cut Google Compute Engine prices by up to 8%.

29. Committed Use Discounts - With Committed Use Discounts, customers can receive a discount of up to 57% off our list price, in exchange for a one or three year purchase commitment paid monthly, with no upfront costs.

30. Free trial extended to 12 months - We’ve extended our free trial from 60 days to 12 months, allowing you to use your $300 credit across all GCP services and APIs, at your own pace and schedule. Plus, we’ve introduced new Always Free products -- non-expiring usage limits that you can use to test and develop applications at no cost. Visit the Google Cloud Platform Free Tier page for details.

31. Engineering Support - Our new Engineering Support offering is a role-based subscription model that allows us to match engineer to engineer, to meet you where your business is, no matter what stage of development you’re in. It has 3 tiers:

  • Development engineering support - ideal for developers or QA engineers that can manage with a response within four to eight business hours, priced at $100/user per month.
  • Production engineering support provides a one-hour response time for critical issues at $250/user per month.
  • On-call engineering support pages a Google engineer and delivers a 15-minute response time 24x7 for critical issues at $1,500/user per month.

32. Cloud.google.com/community site - Google Cloud Platform Community is a new site to learn, connect and share with other people like you, who are interested in GCP. You can follow along with tutorials or submit one yourself, find meetups in your area, and learn about community resources for GCP support, open source projects and more.


New GCP developer platforms and tools reinforce our commitment to openness and choice and giving you what you need to move fast and focus on great code.

33. Google AppEngine Flex (GA) - We announced a major expansion of our popular App Engine platform to new developer communities that emphasizes openness, developer choice, and application portability.

34. Cloud Functions (Beta) - Google Cloud Functions has launched into public beta. It is a serverless environment for creating event-driven applications and microservices, letting you build and connect cloud services with code.

35. Firebase integration with GCP (GA) - Firebase Storage is now Google Cloud Storage for Firebase and adds support for multiple buckets, support for linking to existing buckets, and integrates with Google Cloud Functions.

36. Cloud Container Builder - Cloud Container Builder is a standalone tool that lets you build your Docker containers on GCP regardless of deployment environment. It’s a fast, reliable, and consistent way to package your software into containers as part of an automated workflow.

37. Community Tutorials (Beta)  - With community tutorials, anyone can now submit or request a technical how-to for Google Cloud Platform.


Secure, global and high-performance, we’ve built our cloud for the long haul. This week we announced a slew of new infrastructure updates. 

38. New data center region: California - This new GCP region delivers lower latency for customers on the West Coast of the U.S. and adjacent geographic areas. Like other Google Cloud regions, it will feature a minimum of three zones, benefit from Google’s global, private fibre network, and offer a complement of GCP services.

39. New data center region: Montreal - This new GCP region delivers lower latency for customers in Canada and adjacent geographic areas. Like other Google Cloud regions, it will feature a minimum of three zones, benefit from Google’s global, private fibre network, and offer a complement of GCP services.

40. New data center region: Netherlands - This new GCP region delivers lower latency for customers in Western Europe and adjacent geographic areas. Like other Google Cloud regions, it will feature a minimum of three zones, benefit from Google’s global, private fibre network, and offer a complement of GCP services.

41. Google Container Engine - Managed Nodes - Google Container Engine (GKE) has added Automated Monitoring and Repair of your GKE nodes, letting you focus on your applications while Google ensures your cluster is available and up-to-date.

42. 64 Core machines + more memory - We have doubled the number of vCPUs you can run in an instance from 32 to 64 and up to 416GB of memory per instance.

43. Internal Load balancing (GA) - Internal Load Balancing, now GA, lets you run and scale your services behind a private load balancing IP address which is accessible only to your internal instances, not the internet.

44. Cross-Project Networking (Beta) - Cross-Project Networking (XPN), now in beta, is a virtual network that provides a common network across several Google Cloud Platform projects, enabling simple multi-tenant deployments.


In the past year, we’ve launched 300+ features and updates for G Suite and this week we announced our next generation of collaboration and communication tools.

45. Team Drives (GA for G Suite Business, Education and Enterprise customers) - Team Drives help teams simply and securely manage permissions, ownership and file access for an organization within Google Drive.

46. Drive File Stream (EAP) - Drive File Stream is a way to quickly stream files directly from the cloud to your computer. With Drive File Stream, company data can be accessed directly from your laptop, even if you don’t have much space on your hard drive.

47. Google Vault for Drive (GA for G Suite Business, Education and Enterprise customers) - Google Vault for Drive now gives admins the governance controls they need to manage and secure all of their files, including employee Drives and Team Drives. Google Vault for Drive also lets admins set retention policies that automatically keep what’s needed and delete what’s not.

48. Quick Access in Team Drives (GA) - powered by Google’s machine intelligence, Quick Access helps to surface the right information for employees at the right time within Google Drive. Quick Access now works with Team Drives on iOS and Android devices, and is coming soon to the web.

49. Hangouts Meet (GA to existing customers) - Hangouts Meet is a new video meeting experience built on Hangouts that can run 30-person video conferences without accounts, plugins or downloads. For G Suite Enterprise customers, each call comes with a dedicated dial-in phone number so that team members on the road can join meetings without wifi or data issues.

50. Hangouts Chat (EAP) - Hangouts Chat is an intelligent communication app in Hangouts with dedicated, virtual rooms that connect cross-functional enterprise teams. Hangouts Chat integrates with G Suite apps like Drive and Docs, as well as photos, videos and other third-party enterprise apps.

51. @meet - @meet is an intelligent bot built on top of the Hangouts platform that uses natural language processing and machine learning to automatically schedule meetings for your team with Hangouts Meet and Google Calendar.

52. Gmail Add-ons for G Suite (Developer Preview) - Gmail Add-ons provide a way to surface the functionality of your app or service directly in Gmail. With Add-ons, developers only build their integration once, and it runs natively in Gmail on web, Android and iOS.

53. Edit Opportunities in Google Sheets - with Edit Opportunities in Google Sheets, sales reps can sync a Salesforce Opportunity List View to Sheets to bulk edit data and changes are synced automatically to Salesforce, no upload required.

54. Jamboard - Our whiteboard in the cloud goes GA in May! Jamboard merges the worlds of physical and digital creativity. It’s real time collaboration on a brilliant scale, whether your team is together in the conference room or spread all over the world.


Building on the momentum from a growing number of businesses using Chrome digital signage and kiosks, we added new management tools and APIs in addition to introducing support for Android Kiosk apps on supported Chrome devices. 

55. Android Kiosk Apps for Chrome - Android Kiosk for Chrome lets users manage and deploy Chrome digital signage and kiosks for both web and Android apps. And with Public Session Kiosks, IT admins can now add a number of Chrome packaged apps alongside hosted apps.

56. Chrome Kiosk Management Free trial - This free trial gives customers an easy way to test out Chrome for signage and kiosk deployments.

57. Chrome Device Management (CDM) APIs for Kiosks - These APIs offer programmatic access to various Kiosk policies. IT admins can schedule a device reboot through the new APIs and integrate that functionality directly in a third-party console.

58. Chrome Stability API - This new API allows Kiosk app developers to improve the reliability of the application and the system.


Attendees at Google Cloud Next ‘17 heard stories from many of our valued customers:

59. Colgate - Colgate-Palmolive partnered with Google Cloud and SAP to bring thousands of employees together through G Suite collaboration and productivity tools. The company deployed G Suite to 28,000 employees in less than six months.

60. Disney Consumer Products & Interactive (DCPI) - DCPI is on target to migrate out of its legacy infrastructure this year, and is leveraging machine learning to power next generation guest experiences.

61. eBay - eBay uses Google Cloud technologies including Google Container Engine, Machine Learning and AI for its ShopBot, a personal shopping bot on Facebook Messenger.

62. HSBC - HSBC is one of the world's largest financial and banking institutions and making a large investment in transforming its global IT. The company is working closely with Google to deploy Cloud DataFlow, BigQuery and other data services to power critical proof of concept projects.

63. LUSH - LUSH migrated its global e-commerce site from AWS to GCP in less than six weeks, significantly improving the reliability and stability of its site. LUSH benefits from GCP’s ability to scale as transaction volume surges, which is critical for a retail business. In addition, Google's commitment to renewable energy sources aligns with LUSH's ethical principles.

64. Oden Technologies - Oden was part of Google Cloud’s startup program, and switched its entire platform to GCP from AWS. GCP offers Oden the ability to reliably scale while keeping costs low, perform under heavy loads and consistently delivers sophisticated features including machine learning and data analytics.

65. Planet - Planet migrated to GCP in February, looking to accelerate their workloads and leverage Google Cloud for several key advantages: price stability and predictability, custom instances, first-class Kubernetes support, and Machine Learning technology. Planet also announced the beta release of their Explorer platform.

66. Schlumberger - Schlumberger is making a critical investment in the cloud, turning to GCP to enable high-performance computing, remote visualization and development velocity. GCP is helping Schlumberger deliver innovative products and services to its customers by using HPC to scale data processing, workflow and advanced algorithms.

67. The Home Depot - The Home Depot collaborated with GCP’s Customer Reliability Engineering team to migrate HomeDepot.com to the cloud in time for Black Friday and Cyber Monday. Moving to GCP has allowed the company to better manage huge traffic spikes at peak shopping times throughout the year.

68. Verizon - Verizon is deploying G Suite to more than 150,000 of its employees, allowing for collaboration and flexibility in the workplace while maintaining security and compliance standards. Verizon and Google Cloud have been working together for more than a year to bring simple and secure productivity solutions to Verizon’s workforce.


We brought together Google Cloud partners from our growing ecosystem across G Suite, GCP, Maps, Devices and Education. Our partnering philosophy is driven by a set of principles that emphasize openness, innovation, fairness, transparency and shared success in the cloud market. Here are some of our partners who were out in force at the show:

69. Accenture - Accenture announced that it has designed a mobility solution for Rentokil, a global pest control company, built in collaboration with Google as part of the partnership announced at Horizon in September.

70. Alooma - Alooma announced the integration of the Alooma service with Google Cloud SQL and BigQuery.

71. Authorized Training Partner Program - To help companies scale their training offerings more quickly, and to enable Google to add other training partners to the ecosystem, we are introducing a new track within our partner program to support their unique offerings and needs.

72. Check Point - Check Point® Software Technologies announced Check Point vSEC for Google Cloud Platform, delivering advanced security integrated with GCP as well as their joining of the Google Cloud Technology Partner Program.

73. CloudEndure - We’re collaborating with CloudEndure to offer a no cost, self-service migration tool for Google Cloud Platform (GCP) customers.

74. Coursera - Coursera announced that it is collaborating with Google Cloud Platform to provide an extensive range of Google Cloud training courses. To celebrate this announcement, Coursera is offering all NEXT attendees a 100% discount for the GCP fundamentals class.

75. DocuSign - DocuSign announced deeper integrations with Google Docs.

76. Egnyte - Egnyte announced an enhanced integration with Google Docs that will allow our joint customers to create, edit, and store Google Docs, Sheets and Slides files right from within Egnyte Connect.

77. Google Cloud Global Partner Awards - We recognized 12 Google Cloud partners that demonstrated strong customer success and solution innovation over the past year: Accenture, Pivotal, LumApps, Slack, Looker, Palo Alto Networks, Virtru, SoftBank, DoIT, Snowdrop Solutions, CDW Corporation, and SYNNEX Corporation.

78. iCharts - iCharts announced additional support for several GCP databases, free pivot tables for current Google BigQuery users, and a new product dubbed “iCharts for SaaS.”

79. Intel - In addition to the progress with Skylake, Intel and Google Cloud launched several technology initiatives and market education efforts covering IoT, Kubernetes and TensorFlow, including optimizations, a developer program and tool kits.

80. Intuit - Intuit announced Gmail Add-Ons, which are designed to integrate custom workflows into Gmail based on the context of a given email.

81. Liftigniter - Liftigniter is a member of Google Cloud’s startup program and focused on machine learning personalization using predictive analytics to improve CTR on web and in-app.

82. Looker - Looker launched a suite of Looker Blocks, compatible with Google BigQuery Data Transfer Service, designed to give marketers the tools to enhance analysis of their critical data.

83. Low interest loans for partners - To help Premier Partners grow their teams, Google announced that capital investments are available to qualified partners in the form of low interest loans.

84. MicroStrategy - MicroStrategy announced an integration with Google Cloud SQL for PostgreSQL and Google Cloud SQL for MySQL.

85. New incentives to accelerate partner growth - We are increasing our investments in multiple existing and new incentive programs; including, low interest loans to help Premier Partners grow their teams, increasing co-funding to accelerate deals, and expanding our rebate programs.

86. Orbitera Test Drives for GCP Partners - Test Drives allow customers to try partners’ software and generate high quality leads that can be passed directly to the partners’ sales teams. Google is offering Premier Cloud Partners one year of free Test Drives on Orbitera.

87. Partner specializations - Partners demonstrating strong customer success and technical proficiency in certain solution areas will now qualify to apply for a specialization. We’re launching specializations in application development, data analytics, machine learning and infrastructure.

88. Pivotal - GCP announced Pivotal as our first CRE technology partner. CRE technology partners will work hand-in-hand with Google to thoroughly review their solutions and implement changes to address identified risks to reliability.

89. ProsperWorks - ProsperWorks announced Gmail Add-Ons, which are designed to integrate custom workflows into Gmail based on the context of a given email.

90. Qwiklabs - This recent acquisition will provide Authorized Training Partners the ability to offer hands-on labs and comprehensive courses developed by Google experts to our customers.

91. Rackspace - Rackspace announced a strategic relationship with Google Cloud to become its first managed services support partner for GCP, with plans to collaborate on a new managed services offering for GCP customers set to launch later this year.

92. Rocket.Chat - Rocket.Chat, a member of Google Cloud’s startup program, is adding a number of new product integrations with GCP including Autotranslate via Translate API, integration with Vision API to screen for inappropriate content, integration to NLP API to perform sentiment analysis on public channels, integration with GSuite for authentication and a full move of back-end storage to Google Cloud Storage.

93. Salesforce - Salesforce announced Gmail Add-Ons, which are designed to integrate custom workflows into Gmail based on the context of a given email.

94. SAP - This strategic partnership includes certification of SAP HANA on GCP, new G Suite integrations and future collaboration on building machine learning features into intelligent applications like conversational apps that guide users through complex workflows and transactions.

95. Smyte - Smyte participated in the Google Cloud startup program and protects millions of actions a day on websites and mobile applications. Smyte recently moved from self-hosted Kubernetes to Google Container Engine (GKE).

96. Veritas - Veritas expanded its partnership with Google Cloud to provide joint customers with 360 Data Management capabilities. The partnership will help reduce data storage costs, increase compliance and eDiscovery readiness and accelerate the customer’s journey to Google Cloud Platform.

97. VMware Airwatch - Airwatch provides enterprise mobility management solutions for Android and continues to drive the Google Device ecosystem to enterprise customers.

98. Windows Partner Program - We’re working with top systems integrators in the Windows community to help GCP customers take full advantage of Windows and .NET apps and services on our platform.

99. Xplenty - Xplenty announced the addition of two new services from Google Cloud into their available integrations: Google Cloud Spanner and Google Cloud SQL for PostgreSQL.

100. Zoomdata - Zoomdata announced support for Google’s Cloud Spanner and PostgreSQL on GCP, as well as enhancements to the existing Zoomdata Smart Connector for Google BigQuery. With these new capabilities Zoomdata offers deeply integrated and optimized support for Google Cloud Platform’s Cloud Spanner, PostgreSQL, Google BigQuery, and Cloud DataProc services.

We’re thrilled to have so many new products and partners that can help all of our customers grow. And as our final announcement for Google Cloud Next ’17 — please save the date for Next 2018: June 4–6 in San Francisco.

I guess that makes it 101. :-)



          Google launches the TensorFlow Serving library        
The open source library can work with Google's TensorFlow machine learning models as well as third-party models.
          Machine learning comes to your browser via JavaScript        
A new JavaScript library runs Google's TensorFlow right in the browser with GPU acceleration, a novel way of bringing machine learning to the masses.
          Data Skepticism with Kyle Polich        
With a fast-growing field like data science, it is important to keep some amount of skepticism. Tools can be overhyped, buzzwords can be overemphasized, and people can forget the fundamentals. If you have bad data, you will get bad results in your experimentation. If you don’t know what statistical approach you want to take to your data, it doesn’t matter how well you know Spark or TensorFlow. And if you

Continue reading...


          14.11.2015 11:50:40 Andrey_Perelygin        
Does anyone happen to know of good TensorFlow documentation in Russian, or in easily readable English?
          10.11.2015 19:45:19 SkidanovAlex        
(edit: this is a reply to the comment one level up)

TensorFlow is less a Theano equivalent than a Theano + Lasagne one, because TensorFlow ships with all the layers and update functions.
The biggest criticism of Theano has always been its very slow compilation of models to C/CUDA. TensorFlow has no compilation step, so it is much better suited for prototyping complex models.

          10.11.2015 09:14:20 Infanty        
The problem I see is that they are trying to include everything related to AI and learning in it. On the one hand that is good, on the other hand it is a monster. At the moment there are plenty of libraries under the Apache foundation that can do almost the same thing, but they are in Java and you have to assemble your own bicycle from them yourself. In most cases you need to solve specific problems that call for a specific bicycle, and sometimes a bicycle built from older technologies is more reliable.

In any case, TensorFlow will be useful to dissect and to mine for interesting solutions.
Thanks for the news.
          10.11.2015 07:17:09 excoder        
It's available here: googleresearch.blogspot.se/2015/11/tensorflow-googles-latest-machine_9.html. The translated piece really is a brochure from a marketing-minded Google director. I came across it late last night and decided to translate it. Properly speaking, the article at that link should be translated too, since it is somewhat more detailed. In a day or two the first hands-on reports will appear on blogs, and then those can be translated.
          10.11.2015 01:04:51 xGromMx        
The repository itself: github.com/tensorflow/tensorflow
          PowerAI Revolutionizes Deep Learning (Again!) with Release 4        

I’m excited to share with you that IBM has just released PowerAI release 4 which includes a technology preview of the record breaking Distributed Deep Learning technology we announced earlier this week. Drawing on IBM’s deep expertise in AI, in high-performance computing and system design, we have announced breakthrough results in both accuracy and performance in Deep Learning this week.

Using a cluster of 64 IBM advanced Deep Learning Power servers with 256 GPUs, we demonstrated a new speed record for training today’s most advanced neural networks in 50 minutes, a significant improvement over the hour-long training time reported by Facebook last month. At the same time, this work also significantly boosted neural network accuracy by over 13% for networks trained on very large data sets with over 7.5 million images, improving accuracy to 33.8% from the previous best results at 29.8% published by Microsoft (https://www.usenix.org/system/files/conference/osdi14/osdi14-paper-chilimbi.pdf).

Accelerating the training of deep neural networks (DNNs) is not an idle competition; it has a direct impact on how DNNs can be applied to real-world problems. High-speed training enables AI developers and data scientists to build better neural models for their applications by interactively exploring and optimizing DNN architectures, so speed records are ultimately about empowering technology end users to find better solutions. Underscoring its dual commitment to advancing the state of the art in AI and making the technology available to users immediately, IBM is the only AI leader releasing these new advances to all users as a technology preview in PowerAI at the same time as announcing the breakthrough.

The advances were obtained by applying IBM’s deep expertise in system design and high-performance computing to deep learning with a close collaboration between the IBM Research and product divisions, combining PowerAI software, deep learning servers and excellence in high-performance computing research.

Drawing on IBM’s deep, decade-long expertise in high-performance parallel systems, the Distributed Deep Learning framework achieves an unprecedented scaling efficiency of 95%, ensuring the computing resources are used efficiently (https://www.ibm.com/blogs/research/2017/08/distributed-deep-learning/).

AI users can already apply the technologies that will power the world’s fastest CORAL supercomputers at the US national laboratories (https://openpowerfoundation.org/press-releases/department-of-energy-awards-425-million-for-next-generation-supercomputing-technologies/) to enhance the quality and speed of their deep learning applications today. Advances like these are only possible in an open, standards-based environment that brings together the industry’s best technologies: IBM’s deep learning servers and CORAL technologies are created in the OpenPOWER ecosystem for collaborative innovation, combining IBM’s advanced Power system designs, NVIDIA’s GPU accelerators and Mellanox high-performance networking.

The advances were obtained with PowerAI, which provides a stable, compatible environment for delivering breakthrough innovations that transform scientific research and businesses with AI technologies. The Distributed Deep Learning technology is available today to PowerAI users as a technology preview for Caffe and TensorFlow in PowerAI Release 4, which is available for free download at ibm.biz/powerai.


          Comment on Thoughts on the 2017 KDNuggets Poll on Data Science Tools by Jimmy Moon        
Great insight and frank talk that didn't feel like marketing to me. By the way, you mentioned using RapidMiner with TensorFlow. Could you tell me where I can find materials about that?
          Tutorial: Deep Learning with R on Azure with Keras and CNTK        
by Le Zhang (Data Scientist, Microsoft) and Graham Williams (Director of Data Science, Microsoft) Microsoft's Cognitive Toolkit (better known as CNTK) is a commercial-grade and open-source framework for deep learning tasks. At present CNTK does not have a native R interface but can be accessed through Keras, a high-level API which wraps various deep learning backends including CNTK, TensorFlow, and Theano, for the convenience of modularizing deep neural network construction. The latest version of CNTK (2.1) supports Keras. The RStudio team has developed an R interface for Keras making it possible to run different deep learning backends, including CNTK, from...
          Big Data Processing with Apache Beam Python        

Description

Two trends for data analysis are the ever increasing size of data sets and the drive for lower-latency results. In this talk, we present Apache Beam--a parallel programming model that allows one to implement batch and streaming data processing jobs that can run on a variety of scalable execution engines like Spark and Dataflow--and its new Python SDK. We discuss some of the interesting challenges in providing a Pythonic API and execution environment for distributed processing, and show how Beam allows the user to write a Python pipeline once that can run in both batch and streaming mode. We walk through a few examples of data processing pipelines in Beam for use cases such as real time data analytics and feature engineering with Tensorflow for machine learning pipelines.
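
As a rough illustration of the kind of pipeline the talk describes, here is a minimal batch word-count sketch using the Beam Python SDK (a sketch assuming apache-beam is installed; the bucket paths are placeholders, not taken from the talk):

import apache_beam as beam

p = beam.Pipeline()
(p
 | 'Read'  >> beam.io.ReadFromText('gs://my-bucket/input.txt')    # placeholder input
 | 'Split' >> beam.FlatMap(lambda line: line.split())             # one element per word
 | 'Pair'  >> beam.Map(lambda word: (word, 1))
 | 'Count' >> beam.CombinePerKey(sum)                             # sum counts per key
 | 'Write' >> beam.io.WriteToText('gs://my-bucket/word_counts'))  # placeholder output
p.run().wait_until_finish()

The same pipeline can be handed to a different runner (for example Dataflow) through pipeline options, which is the batch/streaming portability point the talk emphasizes.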


          Scientific Analysis at Scale - a Comparison of Five Systems        

Description

Scientific discoveries are increasingly driven by the analysis of large volumes of image data, and many tools and systems have emerged to support distributed data storage and scalable computation. It is not always immediately clear, however, how well these systems support real-world scientific use cases. Our team set out to evaluate the performance and ease-of-use of five such systems (SciDB, Myria, Spark, Dask, and TensorFlow), as applied to real-world image analysis pipelines drawn from astronomy and neuroscience. We find that each tool has distinct advantages and shortcomings, which point the way to new research opportunities in making large-scale scientific image analysis both efficient and easy to use.


          Automatic Code Generation with SymPy        

Description

Tutorial materials found here: https://scipy2017.scipy.org/ehome/220975/493423/

This tutorial will introduce code generation concepts using the SymPy library. SymPy is a pure Python library for symbolic mathematics. Code generation refers to the act of converting a SymPy symbolic expression into equivalent code in some language. This allows one to use SymPy to symbolically model a problem, and generate fast numerical code for specific platforms that executes that model. This is a powerful tool that is useful to scientists in many domains. Code generation allows users to speed up existing code, to deal only with the high level mathematics of a problem, avoids mathematical errors and typos, makes it possible to deal with expressions that would otherwise be too large to write by hand, and opens possibilities to perform automatic mathematical optimizations of expressions.

SymPy supports generating code for C, C++, Fortran, Matlab/Octave, Python, Cython, Julia, Javascript, LLVM, Rust, Haskell, Mathematica, Tensorflow, and Theano, and can easily be extended to other languages. SymPy’s code generation is used by libraries such as PyDy, pyodesys, sympybotics, pycalphad, and many other programs.
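
As a small taste of what this looks like in practice (a sketch using plain SymPy, not taken from the tutorial materials), you can build a symbolic expression, print it as C code, and wrap it as a fast numerical function:

import sympy as sp

x, y = sp.symbols('x y')
expr = sp.sin(x) * sp.exp(-y) + x**2

# Emit equivalent C code for the expression
print(sp.ccode(expr))

# Turn the same expression into a callable, NumPy-backed numerical function
f = sp.lambdify((x, y), expr, modules='numpy')
print(f(1.0, 2.0))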

Learning objectives

Attendees will be able to:

  • write SymPy expressions describing mathematical functions and identify the function arguments and outputs.
  • use the SymPy code printers to transform SymPy expressions representing common domain specific functions into multiple output languages.
  • use the SymPy code generation routines to output compilable C code and use Cython to access these functions in Python.
  • generate custom vectorized functions with the three SymPy functions: lambdify, ufuncify, and autowrap.
  • create both custom code printers that make use of specialized C libraries and common subexpression elimination (CSE).
  • subclass the core SymPy printers and create a printer for a custom language.

          Evaluation of the Tensor Processing Unit: A Deep Neural Network Accelerator for the Datacenter [Special Seminar]        

With the ending of Moore's Law, many computer architects believe that major improvements in cost-energy-performance must now come from domain-specific hardware. The Tensor Processing Unit (TPU), deployed in Google datacenters since 2015, is a custom chip that accelerates deep neural networks (DNNs). We compare the TPU to contemporary server-class CPUs and GPUs deployed in the same datacenters. Our benchmark workload, written using the high-level TensorFlow framework, uses production DNN applications that represent 95% of our datacenters' DNN demand. The TPU is on average about 15X–30X faster than its contemporary GPU or CPU, with Performance/Watt 30X–80X higher.


          SystemX Seminar        

Over the past few years, we have built two large-scale computer systems for training neural networks, and then applied these systems to a wide variety of problems that have traditionally been very difficult for computers. We have made significant improvements in the state-of-the-art in many of these areas, and our software systems and algorithms have been used by dozens of different groups at Google to train state-of-the-art models for speech recognition, image recognition, various visual detection tasks, language modeling, language translation, and many other tasks. Our second-generation system, TensorFlow, has been designed and implemented based on what we have learned from building and using DistBelief, our first generation system. The TensorFlow API and an initial implementation was released as an open-source project in November, 2015 (see tensorflow.org). In this talk, I'll discuss the design and implementation of TensorFlow, and discuss some future directions for improving the system. This talk describes joint work with a large number of people at Google.


          How not to make millions predicting cryptocurrency prices with social media sentiment scores        

Sometimes hobbies can teach you basic survival skills.  In a world where machines are likely to replace many of today’s jobs, I figured that rolling up my sleeves and jumping into deep learning during my recent summer vacation would be both fun and useful.  In any case, I find it helpful to understand the work that I will eventually need to manage in my career.  As a passion and hobby blog, here are my musings into machine learning/deep learning.


I started with a business problem that I wanted to solve.  I’ve always been curious about cryptocurrencies and hypothesized that the market could be more open than the traditional stock market: a prime opportunity for data to help predict prices.  Moreover, investment firms have long hired the best mathematicians and computer scientists to model trade data.  With only a few days to dedicate to this project, I knew I wasn’t going to come up with anything too groundbreaking.  As an avid Redditor, I often wonder:  What if the information contained in reddit comments could serve as features to predict cryptocurrency prices like bitcoin, litecoin, ethereum, etc.?


First off, I delved into the Machine Learning for Trading course by Georgia Tech delivered through Udacity.   I really love MOOCs and wish they existed back when I was in HS and university.  This MOOC was very informative and the programming language used, Python, is my second favourite (after Groovy).  The concepts from stock market trading decisions from probabilistic machine learning approaches translate well to cryptocurrency trading decisions.  I also decided to use pythonanywhere.com as my cloud IDE.  It’s a cheap, clean and efficient dev environment and with great technical support.


Next, I needed data and lots of it!  This was definitely the most challenging step.  Getting cryptocurrency time series data that is highly granular and free took quite a bit of googling.  I eventually settled on writing a little script to access the API from cryptocompare.com.  Using Python Pandas, I was able to save dataframe data into csv files.  Now for the social media information, I used the popular PRAW Python Reddit API Wrapper to pull the top 25 comments from the top 25 posts of the top 25 subreddits.  I saved the upvote score into a table and used TextBlob to get the sentiment score of the comment itself.  This would be the basis of my main feature to help predict the currency prices.  On a side note, I also looked at using the Watson API from IBM’s Bluemix.  It provided a lot more information on the emotions found in the reddit comments. The API call limitations meant that there would be a cost associated with obtaining the sentiment for each of the 1000s of Reddit comments. Needless to say, my decision to use TextBlob was a no-brainer.
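
For a sense of what the sentiment-scoring step can look like, here is a hedged sketch with PRAW and TextBlob (the credentials and the subreddit are placeholders; the actual script walked the top 25 comments of the top 25 posts of the top 25 subreddits):

import praw
from textblob import TextBlob

reddit = praw.Reddit(client_id='YOUR_ID', client_secret='YOUR_SECRET',
                     user_agent='sentiment-demo')  # placeholder credentials

rows = []
for submission in reddit.subreddit('CryptoCurrency').top(limit=25):  # placeholder subreddit
    submission.comments.replace_more(limit=0)
    for comment in submission.comments[:25]:
        rows.append({
            'created_utc': comment.created_utc,  # used to line comments up with price timestamps
            'upvotes': comment.score,
            'sentiment': TextBlob(comment.body).sentiment.polarity,  # -1 (negative) to +1 (positive)
        })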


With the reddit comments data and associated sentiment scores, I created a time series table (dataframe) that lined up the comment time to the price time.  A little googling and stackoverflow later, I had my data ready for analysis.


Now the fun part could begin.  Using Google’s Open Source TensorFlow library, I created a neural net model that took the inputs from some of the cryptocurrency prices and sentiment scores with the output of the price of a single cryptocurrency.  Luckily, the fine folks that created Keras made it very easy to create neural net models from data.  Keras truly lowers the barrier of entry into deep learning.  Combined with Dr. Jason Brownlee’s tutorials, anyone with basic programming experience can start experimenting with deep learning.
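
The model itself can be only a few lines of Keras on top of TensorFlow. The sketch below is illustrative only; the feature count, layer sizes and random data are assumptions, not the author's actual model:

import numpy as np
from keras.models import Sequential
from keras.layers import Dense

# placeholder features: a few cryptocurrency prices plus a sentiment score per time step
X = np.random.rand(1000, 5)
y = np.random.rand(1000, 1)   # placeholder target: the price of a single cryptocurrency

model = Sequential([
    Dense(32, activation='relu', input_dim=5),
    Dense(16, activation='relu'),
    Dense(1),
])
model.compile(optimizer='adam', loss='mse')
model.fit(X, y, epochs=10, batch_size=32, verbose=0)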


Unfortunately, I didn’t make millions (of course) with my predictive model.  My hypothesis that the “internet’s” sentiment of positivity vs. negativity, derived from reddit comments, was a predictor of bitcoin, litecoin or ethereum prices was not supported by the models generated by my data. I did learn first-hand that data is really king and predictive algorithms are a dime a dozen. Eventually, I remembered to leave my home office to enjoy the warm summer weather!  Maybe next year, I’ll try and get higher quality data with a more granular analysis of emotions from social media combined with longer time series.  Or, you never know, a machine may be writing my next blog article...









          AI tutorial: play with TensorFlow, Google's AI engine        

tensorflow

TensorFlow is an open source artificial intelligence engine developed by Google. Since February, TensorFlow has taken a big step forward by announcing the first fully stable version of the product.

TensorFlow is used today for many use cases: detecting skin cancer, preventing vision loss in diabetics... On GitHub you will find no fewer than 8,800 repositories using TensorFlow.

TensorFlow is written in C++ and provides a Python API. Java and Go APIs are also available in experimental versions.

If you are lucky enough to have a recent graphics card supporting CUDA 3.0 or higher, go for the GPU installation of TensorFlow; graphics card compatibility with CUDA can be checked on the NVIDIA website. Don't forget to install CUDA first if your card is compatible.

$ pip install tensorflow-gpu  # Python 2.7;  GPU support
$ pip3 install tensorflow-gpu # Python 3.n; GPU support

If you don't have a compatible graphics card, turn to the CPU implementation of TensorFlow, which will let you run the tutorials but will not let you train models efficiently.

$ pip install tensorflow      # Python 2.7; CPU support (no GPU support)
$ pip3 install tensorflow     # Python 3.n; CPU support (no GPU support)

Once TensorFlow is installed on your computer, clone the Git repository of TensorFlow models:

$ git clone https://github.com/tensorflow/models.git

Test image recognition with the image classification algorithm:

$ cd models/tutorials/image/imagenet
$ python classify_image.py

The first analysis triggers the download of an ImageNet knowledge base. The analyzed image is that of a panda:

giant panda, panda, panda bear, coon bear, Ailuropoda melanoleuca (score = 0.89632)
indri, indris, Indri indri, Indri brevicaudatus (score = 0.00766)
lesser panda, red panda, panda, bear cat, cat bear, Ailurus fulgens (score = 0.00266)
custard apple (score = 0.00138)
earthstar (score = 0.00104)

You can now try to categorize your own images; to do so, pass the following argument to the classification script:

$ python classify_image.py --image_file monfichier.jpg

The script will return 5 classification proposals with weighting scores ranging from 0 to 1.

Finally, if you want to keep the ImageNet knowledge base in a specific place on your file system, the "--model_dir" argument lets you specify the location of the files. By default, the knowledge base is downloaded and installed into your computer's temporary directory.

$ python classify_image.py --image_file monfichier.jpg --model_dir ./imagenet

Now that you can analyze images with Inception, you can move on to the next step: training TensorFlow on your own images. See you in a future article!

geeek
          Scaling Machine Learning Software with Allinea Tools        

"The majority of deep learning frameworks provide good out-of-the-box performance on a single workstation, but scaling across multiple nodes is still a wild, untamed borderland. This discussion follows the story of one researcher trying to make use of a significant compute resource to accelerate learning over a large number of CPUs. Along the way we note how to find good multiple-CPU performance with Theano* and TensorFlow*, how to extend a single-machine model with MPI and optimize its performance as we scale out and up on both Intel Xeon and Intel Xeon Phi architectures."

The post Scaling Machine Learning Software with Allinea Tools appeared first on insideHPC.


          27 Machine Learning, Math, and Python Cheat Sheets        

Machine learning covers a lot of ground. When I started reviewing this material, I found many different "cheat sheets", each of which lists the key points I need to know for a given topic. In the end I compiled more than 20 machine-learning-related cheat sheets; some of them I use often, and I believe other people will benefit from them too. This article collects 27 cheat sheets I found on the web that I consider good ones. If I have missed any, additions are welcome.

The machine learning field is moving quickly these days, so I can imagine these resources will become outdated soon, but at least for now, as of June 1, 2017, they are all quite popular.

If, like me, you want to download all of the resources in one batch, I have packaged the 27 cheat sheets (Dropbox, Baidu Cloud); enjoy!

If you like this article, remember to give it a thumbs-up below.

Machine Learning

Here I have selected the flowcharts and tables related to machine learning algorithms that I consider the most comprehensive, and listed them below.

Neural Network Architectures

Link: http://www.asimovinstitute.org/neural-network-zoo/

The Neural Network Zoo

Microsoft Azure Algorithm Flowchart

Link: https://docs.microsoft.com/en-us/azure/machine-learning/machine-learning-algorithm-cheat-sheet

Machine learning algorithm cheat sheet for Microsoft Azure Machine Learning Studio

SAS Algorithm Flowchart

Link: http://blogs.sas.com/content/subconsciousmusings/2017/04/12/machine-learning-algorithm-use/

SAS: Which machine learning algorithm should I use?

Algorithm Summary

Link: http://machinelearningmastery.com/a-tour-of-machine-learning-algorithms/

A Tour of Machine Learning Algorithms

Which are the best known machine learning algorithms?

Algorithm Pro/Con

Link: https://blog.dataiku.com/machine-learning-explained-algorithms-are-your-friend

Python

There are plenty of Python resources available online. In this section I have picked the best cheat sheets I have come across and present them to you.

ML Algorithms

Link: https://www.analyticsvidhya.com/blog/2015/09/full-cheatsheet-machine-learning-algorithms/

Python Basics

Link: http://datasciencefree.com/python.pdf

Link: https://www.datacamp.com/community/tutorials/python-data-science-cheat-sheet-basics#gs.0x1rxEA

Numpy

Link: https://www.dataquest.io/blog/numpy-cheat-sheet/

Link: http://datasciencefree.com/numpy.pdf

Link: https://www.datacamp.com/community/blog/python-numpy-cheat-sheet#gs.Nw3V6CE

Link: https://github.com/donnemartin/data-science-ipython-notebooks/blob/master/numpy/numpy.ipynb

Pandas

Link: http://datasciencefree.com/pandas.pdf

Link: https://www.datacamp.com/community/blog/python-pandas-cheat-sheet#gs.S4P4T=U

Link: https://github.com/donnemartin/data-science-ipython-notebooks/blob/master/pandas/pandas.ipynb

Matplotlib

Link: https://www.datacamp.com/community/blog/python-matplotlib-cheat-sheet

Link: https://github.com/donnemartin/data-science-ipython-notebooks/blob/master/matplotlib/matplotlib.ipynb

Scikit Learn

Link: https://www.datacamp.com/community/blog/scikit-learn-cheat-sheet#gs.fZ2A1Jk

Link: http://peekaboo-vision.blogspot.de/2013/01/machine-learning-cheat-sheet-for-scikit.html

Link: https://github.com/rcompton/ml_cheat_sheet/blob/master/supervised_learning.ipynb

Tensorflow

Link: https://github.com/aymericdamien/TensorFlow-Examples/blob/master/notebooks/1_Introduction/basic_operations.ipynb

Pytorch

Link: https://github.com/bfortuner/pytorch-cheatsheet

Math

If you want to truly understand machine learning, you need a solid foundation in statistics (especially probability theory), linear algebra, and calculus. I minored in math in college, but I certainly still need to review this material. If you want to understand the math behind commonly used machine learning algorithms, the cheat sheets below are what you need.

Probability

Link: http://www.wzchen.com/s/probability_cheatsheet.pdf

Linear Algebra

Link: https://minireference.com/static/tutorials/linear_algebra_in_4_pages.pdf

Statistics

Link: http://web.mit.edu/~csvoss/Public/usabo/stats_handout.pdf

 

Source: http://blog.jobbole.com/112009/

 


          æ¨ªå‘对比三大分布式机器学习平台:Spark、PMLS、TensorFlow        
本文是一篇对比现有分布式机器学习平台的论文,对 Spark、PMLS 和 TensorFlow 等平台的架构和性能进行了比较和介绍。
          CPB100: GCP Big Data & Machine Learning Fundamentals - NobleProg Canada Corp, Toronto, Montreal, Vancouver, Ottawa, Halifax, Edmonton        
This 8 hour instructor-led class introduces participants to the Big Data & Machine Learning capabilities of Google Cloud Platform. It provides a quick overview of the Google Cloud Platform and a deeper dive of the data processing capabilities. (For a more general overview of Google Cloud Platform, see CP100A)
This class is intended for:
Data analysts
Data scientists
Business analysts
It is also suitable for IT decision makers evaluating Google Cloud Platform for use by data scientists.
This class is for people who do the following with big data:
Extracting, Loading, Transforming, cleaning, and validating data for use in analytics
Designing pipelines and architectures for data processing
Creating and maintaining machine learning and statistical models
Querying datasets, visualizing query results and creating reports
At the end of this one-day course, participants will be able to:
Identify the purpose and value of the key Big Data and Machine Learning products in the Google Cloud Platform
Use CloudSQL and Cloud Dataproc to migrate existing MySQL and Hadoop/Pig/Spark/Hive workloads to Google Cloud Platform
Employ BigQuery and Cloud Datalab to carry out interactive data analysis
Choose between Cloud SQL, BigTable and Datastore
Train and use a neural network using TensorFlow
Choose between different data processing products on the Google Cloud Platform

Cost:

Certified


          CPB102: Machine Learning with CloudML Training - NobleProg Canada Corp, Toronto, Montreal, Vancouver, Ottawa, Halifax, Edmonton        
Overview
This 8-hour instructor led course builds upon CPB100 and CPB101 (which are prerequisites). Through a combination of instructor-led presentations, demonstrations, and hands-on labs, students learn machine learning and Tensorflow concepts and develop hands-on skills in developing, evaluating, and productionizing machine learning models.
This class is intended for programmers and data scientists responsible for developing predictive analytics using machine learning.  The typical audience member has experience analyzing and visualizing big data, implementing cloud-based big data solutions, and transforming/processing datasets.
Objectives
Understand what kinds of problems machine learning can address
Build a machine learning model using TensorFlow
Build scalable, deployable ML models using Cloud ML
Know the importance of preprocessing and combining features
Incorporate advanced ML concepts into their models
Invoke and customize ML APIs
Productionize trained ML model

Cost:

Certified


          Deep Learning with TensorFlow Training Course - NobleProg Canada Corp, Toronto, Montreal, Vancouver, Ottawa, Halifax, Edmonton        
Machine Learning and Recursive Neural Networks (RNN) basics
NN and RNN
Backpropagation
Long short-term memory (LSTM)
TensorFlow Basics
Creation, Initializing, Saving, and Restoring TensorFlow variables
Feeding, Reading and Preloading TensorFlow Data
How to use TensorFlow infrastructure to train models at scale
Visualizing and Evaluating models with TensorBoard
TensorFlow Mechanics 101
Prepare the Data
Download
Inputs and Placeholders
Build the Graph
Inference
Loss
Training
Train the Model
The Graph
The Session
Train Loop
Evaluate the Model
Build the Eval Graph
Eval Output
Advanced Usage
Threading and Queues
Distributed TensorFlow
Writing Documentation and Sharing your Model
Customizing Data Readers
Using GPUs
Manipulating TensorFlow Model Files
TensorFlow Serving
Introduction
Basic Serving Tutorial
Advanced Serving Tutorial
Serving Inception Model Tutorial

Cost:

Certified


          Re: TensorFlow with R        

This post was written before version 1.0, so the issue comes from differences between TensorFlow versions. If you do it the way you described, it should work fine.


          Re: TensorFlow with R        

Hello, teacher. I am someone studying from your code. When I enter the code exactly as written, an error occurs in the loss function. In case there are others like me, I am posting a corrected version.

with({tf$name_scope("loss");tf$device('/cpu:0')},{
loss <- tf$reduce_mean(tf$nn$sigmoid_cross_entropy_with_logits(labels=pred, logits=y))
optimizer <- tf$train$AdamOptimizer(learning_rate=0.01)$minimize(loss)
})

It turns out you have to specify labels and logits explicitly in the loss function call. It may be a difference between systems, but on my computer it only works this way. I am learning a huge amount from the code; once again, thank you sincerely for sharing it.


          Re: TensorFlow with R        

Hello.

I have little spare time, so I am not doing much outside teaching these days.
I am sorry I cannot be of help.


          Re: TensorFlow with R        

Hello. I am a staff member at the Statistics Training Institute of Statistics Korea. I came across your PDF book on R-based data visualization and am contacting you because I would like to ask whether we could request a lecture. I believe you gave a lecture at one of the Institute's statistics seminars in the past... I would appreciate it if you could contact me by email (see the name card).


          HRIntelligencer 1.11        
Highlights: Ten Ways HR Tech Leaders Can Make the Most of Artificial Intelligence, How AI Detectives Are Cracking Open the Black Box of Deep Learning, Hiring a Product Manager: A Little Clarity Goes a Long Way, and Google Stakes Its Future on TensorFlow, Their Machine Learning Software.
          Comment on Safe Retirement Spending Using Certainty Equivalent Cash Flow and TensorFlow by Weekend reading: Brexit so far. No pain. No gain. – Frozen Pension        
[…] An attempt to use machine learning to find the safe withdrawal rate for retirement [Note: Geeky!] – StreetEye […]
          Comment on Safe Retirement Spending Using Certainty Equivalent Cash Flow and TensorFlow by Druce Vertes, CFA        
This is left as an exercise for the reader. :) But yes, that's pretty much how it could work. One could estimate a covariance matrix and generate a distribution, or use a bootstrap method, drawing from the historical data with or without replacement.
          Comment on Safe Retirement Spending Using Certainty Equivalent Cash Flow and TensorFlow by Druce Vertes, CFA        
It's one optimization of a single value: the CE cash flow across all cohorts.
          Comment on Safe Retirement Spending Using Certainty Equivalent Cash Flow and TensorFlow by Your Name        
Also are the optimizations all separate or are they linked? eg does the variable rate spenddown take into account a changing asset allocation or does it hold it constant? tx
          Comment on Safe Retirement Spending Using Certainty Equivalent Cash Flow and TensorFlow by Your Name        
Nice post! can you show how to do the optimization replacing historical sims w monte carlos? do you just run a set of MC and swap them into the algo for historical runs and press "run"? thx!
          Comment on Safe Retirement Spending Using Certainty Equivalent Cash Flow and TensorFlow by Safe Retirement Spending Using TensorFlow – jedetag        
[…] PreviousNext Safe Retirement Spending Using Certainty Equivalent Cash Flow and TensorFlow […]
          Malaysia Open Source Conference (MOSC) 2017        
The three-day Malaysia Open Source Conference (MOSC) ended last week. MOSC is an open source conference held annually, and this year it reached its 10-year anniversary. I managed to attend the conference with a selective focus on system administration-related presentations, computer security and web application development.

The First Day

The first day's talks were occupied with keynotes from the conference sponsors and major IT brands. After the opening speech and a lightning talk from the community, Mr Julian Gordon delivered his talk on the Hyperledger project, a blockchain-based ledger. Later Mr Sanjay delivered his talk on open source implementation in the financial sector in Malaysia. Before the lunch break we then listened to Mr Jay Swaminathan from Microsoft, who presented his talk on Azure-based services for blockchain technology.




For the afternoon part of the first day I attended a talk by Mr Shak Hassan on Electron-based application development. You can read his slides here. I personally use an Electron-based application for Zulip, so as a non-web developer I already had a mental picture of what Electron is prior to the talk, but the speaker's session taught me more about what happens in the background of such an application. Finally, before I went back for the day, I attended a slot delivered by Intel Corp on the Yocto Project, with which we can automate the process of creating a bootable Linux image for any platform, whether it is an Intel x86/x86_64 platform or an ARM-based platform.



The Second Day

The second day of the conference started with a talk from Malaysia Digital Hub. The speaker, Diana, presented the state of Malaysia-based startups that are currently being shaped and assisted by Malaysia Digital Hub, as well as those that have already matured and can stand on their own. Later, a presenter from Google, Mr Dambo Ren, presented a talk on Google cloud projects.



He also pointed out several major services available on the cloud, for example TensorFlow. After that I chose to attend the Scilab software slot. Dr Khatim, an academician, shared his experience of using Scilab, an open source package similar to Matlab, in his research and with his students. Later I attended a talk titled "Electronic Document Management System with Open Source Tools".


Here two speakers from Cyber Security Malaysia (an agency within Malaysia's Ministry of Science and Technology) presented their studies on two open source document management systems, OpenDocMan and LogicalDoc. The evaluation criteria were based on the following elements: ease of access, costs, a centralized repository, disaster recovery and the security features. From their observations, LogicalDoc scored higher than OpenDocMan.

Later I attended a talk by Mr Kamarul on his experience using the R language and RStudio for medical research at his university. After the lunch break it was my turn to deliver a workshop. My talk was targeted at entry-level system administration, in which I shared my experience using tmux/screen, git, AIDE to monitor file changes on our machines, and Ansible to automate common tasks as much as possible within the system administration context. I demonstrated the use of Ansible with multiple Linux distros (CentOS, Debian/Ubuntu) in order to show how Ansible handles a heterogeneous set of Linux distributions when a command is executed. Most of the presented material was done "live" during the workshop, but I also created slides to help the audience and the public get the basic ideas of the tools I presented. You can read them here [PDF].


The Third Day (Finale)

On the third day I attended the workshop slot delivered by a speaker under the pseudonym Wak Arianto (not his original name). He explained Suricata, a tool whose pattern-matching syntax is very similar to the well-known Snort IDS. Mr Wak explained OS fingerprinting concepts, flowbits and later how to create rules with Suricata. It was an interesting talk, as I could see how to quarantine suspicious files captured from the network (say, possible malware) into a sandbox for further analysis. As far as I understood from the demo and from my extra reading, flowbits is a keyword used to capture the state of a session, and Suricata uses it primarily with TCP for detection. You can read an article about flowbits here. It is called flowbits because it does its parsing on the TCP flows. I can see that we can track the state of the TCP session (for example, whether it is established) based on the writings here.

I had a chance to listen to the FreeBSD developers' slot too. We were lucky to have Mr Martin Wilke, who lives in Malaysia and actively advocates FreeBSD to the local community. Together with Mr Muhammad Moinur Rahman, another FreeBSD developer, they presented the FreeBSD development ecosystem and the current state of the operating system.



Possibly the best was saved for last: I attended a Wi-Fi security workshop presented by Mr Matnet and Mr Jep (both pseudonyms). The workshop began with the theoretical foundations of wireless technology and later covered the development of encryption around it.



The outline of the talks is available here. The speakers introduced the frame types of the 802.11 protocols, which include Control Frames, Data Frames and Management Frames. Management Frames are unencrypted, so the attack tools were developed to concentrate on this part.



Management Frames are susceptible to the following attacks:
  • Deauthentication Attacks
  • Beacon Injection Attacks
  • Karma/MANA Wifi Attacks
  • EvilTwin AP Attacks

Matnet and Jep also showed a social engineering tool called "WiFi Phisher", which can be used as (according to the developer's page on GitHub) a "security tool that mounts automated victim-customized phishing attacks against WiFi clients in order to obtain credentials or infect the victims with malwares". It works together with the EvilTwin AP attack: after achieving a man-in-the-middle position, Wifiphisher redirects all HTTP requests to an attacker-controlled phishing page. Matnet told us the safest way to work within a WiFi environment is to use an 802.11w-supported device (which is yet to be widely found, at least in Malaysia). I found some information on 802.11w that could help to understand this protocol a bit here.

Conclusion

For me this is the most anticipated annual event, where I can meet professionals from different backgrounds and keep my knowledge up to date with the latest developments in open source tools in the industry. The organizers surely did a good job with this event and I hope to attend again next year! Thank you for giving me the opportunity to speak at this conference (and for the nice swag too!)

Apart from MOSC I also plan to attend the annual Python Conference (PyCon), which is going to be special this year as it will be organized at the Asia Pacific (APAC) level. You can read more about PyCon APAC 2017 here (in case you would like to attend).


          Benchmarking TensorFlow on Cloud CPUs: Cheaper Deep Learning Than Cloud GPUs        
    Benchmarking TensorFlow on Cloud CPUs: Cheaper Deep Learning Than Cloud GPUs
          Fast Forward: CakePHP 2.10.1 released & .NET Framework 4.7.1 Early Access        

Today in Fast Forward: CakePHP 2.10.1 released | Early access to .NET Framework 4.7.1 | TensorFlow Serving 1.0 available | Loading behavior of custom fonts | HTML5 cross-browser polyfills

The post Fast Forward: CakePHP 2.10.1 veröffentlicht & .NET-Framework 4.7.1 Early Access appeared on entwickler.de.


              ç®€å•ä¸‰å±‚神经网络        
    #!/usr/bin/python # encoding:utf8 """ @author: james shu @contact: 598546998@qq.com @file: layer.py @time: 6/4/2017 4:21 PM """ import tensorflow as tf import numpy as np import matplotlib.pyplot as plt def add_layer (inputs , in_size , out_size , activation_fu ...<br /><img src="http://attachbak.dataguru.cn/attachments/album/201706/04/203621jesal0za6sxp6gl1.png.thumb.jpg">
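
The snippet above is cut off mid-signature; a common completion of this kind of add_layer helper looks like the following (an assumption about the author's intent, written against the TensorFlow 1.x API, not the author's verified code):

def add_layer(inputs, in_size, out_size, activation_function=None):
    # one fully connected layer: inputs * W + b, optionally passed through an activation
    weights = tf.Variable(tf.random_normal([in_size, out_size]))
    biases = tf.Variable(tf.zeros([1, out_size]) + 0.1)
    wx_plus_b = tf.matmul(inputs, weights) + biases
    if activation_function is None:
        return wx_plus_b
    return activation_function(wx_plus_b)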
              CNTK Revisited. A New Deep Learning Toolkit Release from Microsoft        
    In a pair of articles from last winter (first article, second article) we looked at Microsoft’s “Computational Network Toolkit” and compared it to Google’s Tensorflow.   Microsoft has now released a major upgrade of the software and rebranded it as part of the Microsoft Cognitive Toolkit.  This release is a major improvement over the initial release.  […]
          Comment on "Google releases its machine-learning system TensorFlow as open source" by Sonnet – a library for building neural networks        
[…] What exactly is TensorFlow? The answer is here. […]
              Senior AI Solution Architect - Innodata Labs - Remote        
    Our stack includes Python, TensorFlow, Hadoop, Messaging (RabbitMQ, Kafka), Docker / Kubernetes and cloud infrastructure (AWS, Google Cloud) as well as various...
    From Innodata Labs - Fri, 05 May 2017 23:23:37 GMT - View all Remote jobs
          Running TensorFlow on Android        
Google provides TensorFlow, an open source library that can be used on Android to implement machine learning. TensorFlow is an open source software library for machine intelligence provided by Google. I searched the internet a lot, but I could not find a simple method or a simple example for building TensorFlow for Android. Based on the information that is available, I was able to put the pieces together and build it. To share this process so that others can understand it easily … Read more: Running TensorFlow on Android
          [Tutorial] Download Udemy Python for Data Science and Machine Learning Bootcamp - Python training for data science and machine learning        

Download Udemy Python for Data Science and Machine Learning Bootcamp - Python training for data science and machine learning

Data science is the study of extracting knowledge and insight from collections of data and information. Its goal is to derive meaning from data and to build data-driven products. Practitioners in this field are called data scientists. One of the broadest and most widely used branches of artificial intelligence is machine learning, which deals with developing and discovering methods and algorithms that allow computers and systems to learn. Data science is one of the most enjoyable jobs and is among the 10 best and most popular jobs in the world. This job ...



              100 announcements (!) from Google Cloud Next '17        

San Francisco — What a week! Google Cloud Next ‘17 has come to an end, but really, it’s just the beginning. We welcomed 10,000+ attendees including customers, partners, developers, IT leaders, engineers, press, analysts, cloud enthusiasts (and skeptics). Together we engaged in 3 days of keynotes, 200+ sessions, and 4 invitation-only summits. Hard to believe this was our first show as all of Google Cloud with GCP, G Suite, Chrome, Maps and Education. Thank you to all who were here with us in San Francisco this week, and we hope to see you next year.

    If you’re a fan of video highlights, we’ve got you covered. Check out our Day 1 keynote (in less than 4 minutes) and Day 2 keynote (in under 5!).

    One of the common refrains from customers and partners throughout the conference was “Wow, you’ve been busy. I can’t believe how many announcements you’ve had at Next!” So we decided to count all the announcements from across Google Cloud and in fact we had 100 (!) announcements this week.

    For the list lovers amongst you, we’ve compiled a handy-dandy run-down of our announcements from the past few days:


    Google Cloud is excited to welcome two new acquisitions to the Google Cloud family this week, Kaggle and AppBridge.

    1. Kaggle - Kaggle is one of the world's largest communities of data scientists and machine learning enthusiasts. Kaggle and Google Cloud will continue to support machine learning training and deployment services in addition to offering the community the ability to store and query large datasets.

    2. AppBridge - Google Cloud acquired Vancouver-based AppBridge this week, which helps you migrate data from on-prem file servers into G Suite and Google Drive.


    Google Cloud brings a suite of new security features to Google Cloud Platform and G Suite designed to help safeguard your company’s assets and prevent disruption to your business: 

    3. Identity-Aware Proxy (IAP) for Google Cloud Platform (Beta) - Identity-Aware Proxy lets you provide access to applications based on risk, rather than using a VPN. It provides secure application access from anywhere, restricts access by user, identity and group, deploys with integrated phishing resistant Security Key and is easier to setup than end-user VPN.

    4. Data Loss Prevention (DLP) for Google Cloud Platform (Beta) - Data Loss Prevention API lets you scan data for 40+ sensitive data types, and is used as part of DLP in Gmail and Drive. You can find and redact sensitive data stored in GCP, invigorate old applications with new sensitive data sensing “smarts” and use predefined detectors as well as customize your own.

    5. Key Management Service (KMS) for Google Cloud Platform (GA) - Key Management Service allows you to generate, use, rotate, and destroy symmetric encryption keys for use in the cloud.

    6. Security Key Enforcement (SKE) for Google Cloud Platform (GA) - Security Key Enforcement allows you to require security keys be used as the 2-Step verification factor for enhanced anti-phishing security whenever a GCP application is accessed.

    7. Vault for Google Drive (GA) - Google Vault is the eDiscovery and archiving solution for G Suite. Vault enables admins to easily manage their G Suite data lifecycle and search, preview and export the G Suite data in their domain. Vault for Drive enables full support for Google Drive content, including Team Drive files.

    8. Google-designed security chip, Titan - Google uses Titan to establish hardware root of trust, allowing us to securely identify and authenticate legitimate access at the hardware level. Titan includes a hardware random number generator, performs cryptographic operations in the isolated memory, and has a dedicated secure processor (on-chip).


    New GCP data analytics products and services help organizations solve business problems with data, rather than spending time and resources building, integrating and managing the underlying infrastructure:

    9. BigQuery Data Transfer Service (Private Beta) - BigQuery Data Transfer Service makes it easy for users to quickly get value from all their Google-managed advertising datasets. With just a few clicks, marketing analysts can schedule data imports from Google Adwords, DoubleClick Campaign Manager, DoubleClick for Publishers and YouTube Content and Channel Owner reports.

    10. Cloud Dataprep (Private Beta) - Cloud Dataprep is a new managed data service, built in collaboration with Trifacta, that makes it faster and easier for BigQuery end-users to visually explore and prepare data for analysis without the need for dedicated data engineer resources.

    11. New Commercial Datasets - Businesses often look for datasets (public or commercial) outside their organizational boundaries. Commercial datasets offered include financial market data from Xignite, residential real-estate valuations (historical and projected) from HouseCanary, predictions for when a house will go on sale from Remine, historical weather data from AccuWeather, and news archives from Dow Jones, all immediately ready for use in BigQuery (with more to come as new partners join the program).

    12. Python for Google Cloud Dataflow in GA - Cloud Dataflow is a fully managed data processing service supporting both batch and stream execution of pipelines. Until recently, these benefits have been available solely to Java developers. Now there’s a Python SDK for Cloud Dataflow in GA.

    13. Stackdriver Monitoring for Cloud Dataflow (Beta) - We’ve integrated Cloud Dataflow with Stackdriver Monitoring so that you can access and analyze Cloud Dataflow job metrics and create alerts for specific Dataflow job conditions.

    14. Google Cloud Datalab in GA - This interactive data science workflow tool makes it easy to do iterative model and data analysis in a Jupyter notebook-based environment using standard SQL, Python and shell commands.

    15. Cloud Dataproc updates - Our fully managed service for running Apache Spark, Flink and Hadoop pipelines has new support for restarting failed jobs (including automatic restart as needed) in beta, the ability to create single-node clusters for lightweight sandbox development, in beta, GPU support, and the cloud labels feature, for more flexibility managing your Dataproc resources, is now GA.


    New GCP databases and database features round out a platform on which developers can build great applications across a spectrum of use cases:

    16. Cloud SQL for Postgre SQL (Beta) - Cloud SQL for PostgreSQL implements the same design principles currently reflected in Cloud SQL for MySQL, namely, the ability to securely store and connect to your relational data via open standards.

    17. Microsoft SQL Server Enterprise (GA) - Available on Google Compute Engine, plus support for Windows Server Failover Clustering (WSFC) and SQL Server AlwaysOn Availability (GA).

    18. Cloud SQL for MySQL improvements - Increased performance for demanding workloads via 32-core instances with up to 208GB of RAM, and central management of resources via Identity and Access Management (IAM) controls.

    19. Cloud Spanner - Launched a month ago, but still, it would be remiss not to mention it because, hello, it’s Cloud Spanner! The industry’s first horizontally scalable, globally consistent, relational database service.

    20. SSD persistent-disk performance improvements - SSD persistent disks now have increased throughput and IOPS performance, which are particularly beneficial for database and analytics workloads. Read these docs for complete details about persistent-disk performance.

    21. Federated query on Cloud Bigtable - We’ve extended BigQuery’s reach to query data inside Cloud Bigtable, the NoSQL database service for massive analytic or operational workloads that require low latency and high throughput (particularly common in Financial Services and IoT use cases).


    New GCP Cloud Machine Learning services bolster our efforts to make machine learning accessible to organizations of all sizes and sophistication:

    22.  Cloud Machine Learning Engine (GA) - Cloud ML Engine, now generally available, is for organizations that want to train and deploy their own models into production in the cloud.

    23. Cloud Video Intelligence API (Private Beta) - A first of its kind, Cloud Video Intelligence API lets developers easily search and discover video content by providing information about entities (nouns such as “dog,” “flower”, or “human” or verbs such as “run,” “swim,” or “fly”) inside video content.

    24. Cloud Vision API (GA) - Cloud Vision API reaches GA and offers new capabilities for enterprises and partners to classify a more diverse set of images. The API can now recognize millions of entities from Google’s Knowledge Graph and offers enhanced OCR capabilities that can extract text from scans of text-heavy documents such as legal contracts or research papers or books.

    25. Machine learning Advanced Solution Lab (ASL) - ASL provides dedicated facilities for our customers to directly collaborate with Google’s machine-learning experts to apply ML to their most pressing challenges.

    26. Cloud Jobs API - A powerful aid to job search and discovery, Cloud Jobs API now has new features such as Commute Search, which will return relevant jobs based on desired commute time and preferred mode of transportation.

    27. Machine Learning Startup Competition - We announced a Machine Learning Startup Competition in collaboration with venture capital firms Data Collective and Emergence Capital, and with additional support from a16z, Greylock Partners, GV, Kleiner Perkins Caufield & Byers and Sequoia Capital.


    New GCP pricing continues our intention to create customer-friendly pricing that’s as smart as our products; and support services that are geared towards meeting our customers where they are:

    28. Compute Engine price cuts - Continuing our history of pricing leadership, we’ve cut Google Compute Engine prices by up to 8%.

    29. Committed Use Discounts - With Committed Use Discounts, customers can receive a discount of up to 57% off our list price, in exchange for a one or three year purchase commitment paid monthly, with no upfront costs.

    30. Free trial extended to 12 months - We’ve extended our free trial from 60 days to 12 months, allowing you to use your $300 credit across all GCP services and APIs, at your own pace and schedule. Plus, we’re introduced new Always Free products -- non-expiring usage limits that you can use to test and develop applications at no cost. Visit the Google Cloud Platform Free Tier page for details.

    31. Engineering Support - Our new Engineering Support offering is a role-based subscription model that allows us to match engineer to engineer, to meet you where your business is, no matter what stage of development you’re in. It has 3 tiers:

    • Development engineering support - ideal for developers or QA engineers that can manage with a response within four to eight business hours, priced at $100/user per month.
    • Production engineering support provides a one-hour response time for critical issues at $250/user per month.
    • On-call engineering support pages a Google engineer and delivers a 15-minute response time 24x7 for critical issues at $1,500/user per month.

    32. Cloud.google.com/community site - Google Cloud Platform Community is a new site to learn, connect and share with other people like you, who are interested in GCP. You can follow along with tutorials or submit one yourself, find meetups in your area, and learn about community resources for GCP support, open source projects and more.


    New GCP developer platforms and tools reinforce our commitment to openness and choice and giving you what you need to move fast and focus on great code.

    33. Google AppEngine Flex (GA) - We announced a major expansion of our popular App Engine platform to new developer communities that emphasizes openness, developer choice, and application portability.

    34. Cloud Functions (Beta) - Google Cloud Functions has launched into public beta. It is a serverless environment for creating event-driven applications and microservices, letting you build and connect cloud services with code.

    35. Firebase integration with GCP (GA) - Firebase Storage is now Google Cloud Storage for Firebase and adds support for multiple buckets, support for linking to existing buckets, and integrates with Google Cloud Functions.

    36. Cloud Container Builder - Cloud Container Builder is a standalone tool that lets you build your Docker containers on GCP regardless of deployment environment. It’s a fast, reliable, and consistent way to package your software into containers as part of an automated workflow.

    37. Community Tutorials (Beta)  - With community tutorials, anyone can now submit or request a technical how-to for Google Cloud Platform.


    Secure, global and high-performance, we’ve built our cloud for the long haul. This week we announced a slew of new infrastructure updates. 

    38. New data center region: California - This new GCP region delivers lower latency for customers on the West Coast of the U.S. and adjacent geographic areas. Like other Google Cloud regions, it will feature a minimum of three zones, benefit from Google’s global, private fibre network, and offer a complement of GCP services.

    39. New data center region: Montreal - This new GCP region delivers lower latency for customers in Canada and adjacent geographic areas. Like other Google Cloud regions, it will feature a minimum of three zones, benefit from Google’s global, private fibre network, and offer a complement of GCP services.

    40. New data center region: Netherlands - This new GCP region delivers lower latency for customers in Western Europe and adjacent geographic areas. Like other Google Cloud regions, it will feature a minimum of three zones, benefit from Google’s global, private fibre network, and offer a complement of GCP services.

    41. Google Container Engine - Managed Nodes - Google Container Engine (GKE) has added Automated Monitoring and Repair of your GKE nodes, letting you focus on your applications while Google ensures your cluster is available and up-to-date.

    42. 64 Core machines + more memory - We have doubled the number of vCPUs you can run in an instance from 32 to 64 and up to 416GB of memory per instance.

    43. Internal Load balancing (GA) - Internal Load Balancing, now GA, lets you run and scale your services behind a private load balancing IP address which is accessible only to your internal instances, not the internet.

    44. Cross-Project Networking (Beta) - Cross-Project Networking (XPN), now in beta, is a virtual network that provides a common network across several Google Cloud Platform projects, enabling simple multi-tenant deployments.


    In the past year, we’ve launched 300+ features and updates for G Suite and this week we announced our next generation of collaboration and communication tools.

    45. Team Drives (GA for G Suite Business, Education and Enterprise customers) - Team Drives help teams simply and securely manage permissions, ownership and file access for an organization within Google Drive.

    46. Drive File Stream (EAP) - Drive File Stream is a way to quickly stream files directly from the cloud to your computer With Drive File Steam, company data can be accessed directly from your laptop, even if you don’t have much space on your hard drive.

    47. Google Vault for Drive (GA for G Suite Business, Education and Enterprise customers) - Google Vault for Drive now gives admins the governance controls they need to manage and secure all of their files, including employee Drives and Team Drives. Google Vault for Drive also lets admins set retention policies that automatically keep what’s needed and delete what’s not.

    48. Quick Access in Team Drives (GA) - powered by Google’s machine intelligence, Quick Access helps to surface the right information for employees at the right time within Google Drive. Quick Access now works with Team Drives on iOS and Android devices, and is coming soon to the web.

    49. Hangouts Meet (GA to existing customers) - Hangouts Meet is a new video meeting experience built on the Hangouts that can run 30-person video conferences without accounts, plugins or downloads. For G Suite Enterprise customers, each call comes with a dedicated dial-in phone number so that team members on the road can join meetings without wifi or data issues.

    50. Hangouts Chat (EAP) - Hangouts Chat is an intelligent communication app in Hangouts with dedicated, virtual rooms that connect cross-functional enterprise teams. Hangouts Chat integrates with G Suite apps like Drive and Docs, as well as photos, videos and other third-party enterprise apps.

    51. @meet - @meet is an intelligent bot built on top of the Hangouts platform that uses natural language processing and machine learning to automatically schedule meetings for your team with Hangouts Meet and Google Calendar.

    52. Gmail Add-ons for G Suite (Developer Preview) - Gmail Add-ons provide a way to surface the functionality of your app or service directly in Gmail. With Add-ons, developers only build their integration once, and it runs natively in Gmail on web, Android and iOS.

    53. Edit Opportunities in Google Sheets - with Edit Opportunities in Google Sheets, sales reps can sync a Salesforce Opportunity List View to Sheets to bulk edit data and changes are synced automatically to Salesforce, no upload required.

    54. Jamboard - Our whiteboard in the cloud goes GA in May! Jamboard merges the worlds of physical and digital creativity. It’s real time collaboration on a brilliant scale, whether your team is together in the conference room or spread all over the world.


    Building on the momentum from a growing number of businesses using Chrome digital signage and kiosks, we added new management tools and APIs in addition to introducing support for Android Kiosk apps on supported Chrome devices. 

    55. Android Kiosk Apps for Chrome - Android Kiosk for Chrome lets users manage and deploy Chrome digital signage and kiosks for both web and Android apps. And with Public Session Kiosks, IT admins can now add a number of Chrome packaged apps alongside hosted apps.

    56. Chrome Kiosk Management Free trial - This free trial gives customers an easy way to test out Chrome for signage and kiosk deployments.

    57. Chrome Device Management (CDM) APIs for Kiosks - These APIs offer programmatic access to various Kiosk policies. IT admins can schedule a device reboot through the new APIs and integrate that functionality directly in a third-party console.

    58. Chrome Stability API - This new API allows Kiosk app developers to improve the reliability of the application and the system.


    Attendees at Google Cloud Next ‘17 heard stories from many of our valued customers:

    59. Colgate - Colgate-Palmolive partnered with Google Cloud and SAP to bring thousands of employees together through G Suite collaboration and productivity tools. The company deployed G Suite to 28,000 employees in less than six months.

    60. Disney Consumer Products & Interactive (DCPI) - DCPI is on target to migrate out of its legacy infrastructure this year, and is leveraging machine learning to power next generation guest experiences.

    61. eBay - eBay uses Google Cloud technologies including Google Container Engine, Machine Learning and AI for its ShopBot, a personal shopping bot on Facebook Messenger.

    62. HSBC - HSBC is one of the world's largest financial and banking institutions and is making a large investment in transforming its global IT. The company is working closely with Google to deploy Cloud DataFlow, BigQuery and other data services to power critical proof of concept projects.

    63. LUSH - LUSH migrated its global e-commerce site from AWS to GCP in less than six weeks, significantly improving the reliability and stability of its site. LUSH benefits from GCP’s ability to scale as transaction volume surges, which is critical for a retail business. In addition, Google's commitment to renewable energy sources aligns with LUSH's ethical principles.

    64. Oden Technologies - Oden was part of Google Cloud’s startup program, and switched its entire platform to GCP from AWS. GCP gives Oden the ability to scale reliably while keeping costs low, to perform under heavy loads, and to consistently deliver sophisticated features including machine learning and data analytics.

    65. Planet - Planet migrated to GCP in February, looking to accelerate their workloads and leverage Google Cloud for several key advantages: price stability and predictability, custom instances, first-class Kubernetes support, and Machine Learning technology. Planet also announced the beta release of their Explorer platform.

    66. Schlumberger - Schlumberger is making a critical investment in the cloud, turning to GCP to enable high-performance computing, remote visualization and development velocity. GCP is helping Schlumberger deliver innovative products and services to its customers by using HPC to scale data processing, workflow and advanced algorithms.

    67. The Home Depot - The Home Depot collaborated with GCP’s Customer Reliability Engineering team to migrate HomeDepot.com to the cloud in time for Black Friday and Cyber Monday. Moving to GCP has allowed the company to better manage huge traffic spikes at peak shopping times throughout the year.

    68. Verizon - Verizon is deploying G Suite to more than 150,000 of its employees, allowing for collaboration and flexibility in the workplace while maintaining security and compliance standards. Verizon and Google Cloud have been working together for more than a year to bring simple and secure productivity solutions to Verizon’s workforce.


    We brought together Google Cloud partners from our growing ecosystem across G Suite, GCP, Maps, Devices and Education. Our partnering philosophy is driven by a set of principles that emphasize openness, innovation, fairness, transparency and shared success in the cloud market. Here are some of our partners who were out in force at the show:

    69. Accenture - Accenture announced that it has designed a mobility solution for Rentokil, a global pest control company, built in collaboration with Google as part of the partnership announced at Horizon in September.

    70. Alooma - Alooma announced the integration of the Alooma service with Google Cloud SQL and BigQuery.

    71. Authorized Training Partner Program - To help companies scale their training offerings more quickly, and to enable Google to add other training partners to the ecosystem, we are introducing a new track within our partner program to support their unique offerings and needs.

    72. Check Point - Check Point® Software Technologies announced Check Point vSEC for Google Cloud Platform, delivering advanced security integrated with GCP as well as their joining of the Google Cloud Technology Partner Program.

    73. CloudEndure - We’re collaborating with CloudEndure to offer a no cost, self-service migration tool for Google Cloud Platform (GCP) customers.

    74. Coursera - Coursera announced that it is collaborating with Google Cloud Platform to provide an extensive range of Google Cloud training courses. To celebrate this announcement, Coursera is offering all NEXT attendees a 100% discount on the GCP fundamentals class.

    75. DocuSign - DocuSign announced deeper integrations with Google Docs.

    76. Egnyte - Egnyte announced an enhanced integration with Google Docs that will allow our joint customers to create, edit, and store Google Docs, Sheets and Slides files right from within the Egnyte Connect.

    77. Google Cloud Global Partner Awards - We recognized 12 Google Cloud partners that demonstrated strong customer success and solution innovation over the past year: Accenture, Pivotal, LumApps, Slack, Looker, Palo Alto Networks, Virtru, SoftBank, DoIT, Snowdrop Solutions, CDW Corporation, and SYNNEX Corporation.

    78. iCharts - iCharts announced additional support for several GCP databases, free pivot tables for current Google BigQuery users, and a new product dubbed “iCharts for SaaS.”

    79. Intel - In addition to the progress with Skylake, Intel and Google Cloud launched several technology initiatives and market education efforts covering IoT, Kubernetes and TensorFlow, including optimizations, a developer program and tool kits.

    80. Intuit - Intuit announced Gmail Add-Ons, which are designed to integrate custom workflows into Gmail based on the context of a given email.

    81. Liftigniter - Liftigniter is a member of Google Cloud’s startup program and focused on machine learning personalization using predictive analytics to improve CTR on web and in-app.

    82. Looker - Looker launched a suite of Looker Blocks, compatible with Google BigQuery Data Transfer Service, designed to give marketers the tools to enhance analysis of their critical data.

    83. Low interest loans for partners - To help Premier Partners grow their teams, Google announced that capital investments are available to qualified partners in the form of low interest loans.

    84. MicroStrategy - MicroStrategy announced an integration with Google Cloud SQL for PostgreSQL and Google Cloud SQL for MySQL.

    85. New incentives to accelerate partner growth - We are increasing our investments in multiple existing and new incentive programs; including, low interest loans to help Premier Partners grow their teams, increasing co-funding to accelerate deals, and expanding our rebate programs.

    86. Orbitera Test Drives for GCP Partners - Test Drives allow customers to try partners’ software and generate high quality leads that can be passed directly to the partners’ sales teams. Google is offering Premier Cloud Partners one year of free Test Drives on Orbitera.

    87. Partner specializations - Partners demonstrating strong customer success and technical proficiency in certain solution areas will now qualify to apply for a specialization. We’re launching specializations in application development, data analytics, machine learning and infrastructure.

    88. Pivotal - GCP announced Pivotal as our first CRE technology partner. CRE technology partners will work hand-in-hand with Google to thoroughly review their solutions and implement changes to address identified risks to reliability.

    89. ProsperWorks - ProsperWorks announced Gmail Add-Ons, which are designed to integrate custom workflows into Gmail based on the context of a given email.

    90. Qwiklabs - This recent acquisition will provide Authorized Training Partners the ability to offer hands-on labs and comprehensive courses developed by Google experts to our customers.

    91. Rackspace - Rackspace announced a strategic relationship with Google Cloud to become its first managed services support partner for GCP, with plans to collaborate on a new managed services offering for GCP customers set to launch later this year.

    92. Rocket.Chat - Rocket.Chat, a member of Google Cloud’s startup program, is adding a number of new product integrations with GCP including Autotranslate via Translate API, integration with Vision API to screen for inappropriate content, integration to NLP API to perform sentiment analysis on public channels, integration with GSuite for authentication and a full move of back-end storage to Google Cloud Storage.

    93. Salesforce - Salesforce announced Gmail Add-Ons, which are designed to integrate custom workflows into Gmail based on the context of a given email.

    94. SAP - This strategic partnership includes certification of SAP HANA on GCP, new G Suite integrations and future collaboration on building machine learning features into intelligent applications like conversational apps that guide users through complex workflows and transactions.

    95. Smyte - Smyte participated in the Google Cloud startup program and protects millions of actions a day on websites and mobile applications. Smyte recently moved from self-hosted Kubernetes to Google Container Engine (GKE).

    96. Veritas - Veritas expanded its partnership with Google Cloud to provide joint customers with 360 Data Management capabilities. The partnership will help reduce data storage costs, increase compliance and eDiscovery readiness and accelerate the customer’s journey to Google Cloud Platform.

    97. VMware Airwatch - Airwatch provides enterprise mobility management solutions for Android and continues to drive the Google Device ecosystem to enterprise customers.

    98. Windows Partner Program - We’re working with top systems integrators in the Windows community to help GCP customers take full advantage of Windows and .NET apps and services on our platform.

    99. Xplenty - Xplenty announced the addition of two new services from Google Cloud into their available integrations: Google Cloud Spanner and Google Cloud SQL for PostgreSQL.

    100. Zoomdata - Zoomdata announced support for Google’s Cloud Spanner and PostgreSQL on GCP, as well as enhancements to the existing Zoomdata Smart Connector for Google BigQuery. With these new capabilities Zoomdata offers deeply integrated and optimized support for Google Cloud Platform’s Cloud Spanner, PostgreSQL, Google BigQuery, and Cloud DataProc services.

    We’re thrilled to have so many new products and partners that can help all of our customers grow. And as our final announcement for Google Cloud Next ’17 — please save the date for Next 2018: June 4–6 in San Francisco.

    I guess that makes it 101. :-)



              dataaspirant-june2016-newsletter        

        Blog Posts: [1] Top 10 Machine Learning Algorithms [2] Explaining Deep Learning  [3] Wide & Deep Learning: Better Together with TensorFlow [4] Building intelligent applications with deep learning and TensorFlow [5] k-Nearest Neighbors (k-NN) [6] Hierarchical clustering [7] Access your data in Amazon Redshift and PostgreSQL with Python and R [8] Why E-Commerce Can’t Afford to Ignore Machine


              Get into the flow        

    In Flow-Based Programming, programs are modeled as data flowing between independent processing units. Who would not think of channels and goroutines as a natural analogy?

    Not too far away in the future, my new online Go course goes live. Get notified via appliedgo.com.

    As trivial as this may sound, all software is about processing data. Yet, when you look at code written in a “traditional” programming language, the actual data flow is not readily visible. Instead, what you mainly see are just the control structures. The actual data flow only happens to occur at runtime, as a consequence from the control structures.

    Flow-Based Programming (FBP) turns the view on code and data upside down. Here, the data flow is the first thing you look at; it is the main principle that defines the structure of your application. Processing of data happens within many small nodes that sit between the endpoints of data pipelines.

    At this level, the processing nodes are just black boxes in a graphic flow diagram. The actual code hides within these boxes.

    Flow-based programming and concurrency

    Looking at an FBP diagram immediately raises two thoughts.

    First, the data flow model is inherently concurrent. Data streams are independent of each other, and so are the nodes. Looks like optimal separation of concerns.

    Second, a data flow looks darned close to channels and goroutines!

    Do we have a natural match here? It seems tempting to build an FBP model directly on Go’s built-in concurrency concepts.

    In fact, this has been done already.

    Go FBP libraries

    A quick search on GitHub reveals a handful of Go-based FBP projects, which I list here together with their own descriptions.

    trustmaster/goflow

    “This is quite a minimalistic implementation of Flow-based programming and several other concurrent models in Go programming language that aims at designing applications as graphs of components which react to data that flows through the graph.”

    scipipe/scipipe

    “SciPipe is an experimental library for writing scientific Workflows in vanilla Go(lang). The architecture of SciPipe is based on an flow-based programming like pattern in pure Go (…)

    flowbase/flowbase

    “A Flow-based Programming (FBP) micro-framework for Go (Golang).”

    ryanpeach/goflow

    “A LabVIEW and TensorFlow Inspired Graph-Based Programming Environment for AI handled within the Go Programming Language.”

    7ing/go-flow

    “A cancellable concurrent pattern for Go programming language”

    cascades-fbp/cascades

    “Language-Agnostic Programming Framework for Data-Driven Applications”

    themalkolm/go-fbp

    “Go implementation of Flow-based programming.”

    (The last one actually relies on input from a graphical FBP editor (DrawFBP) that it turns into code stubs.)

    To be fair, some of these libs seem not actively maintained anymore. I included them anyway as there is no single true approach to this, and each of these libs shows a different approach and focuses on different aspects.

    I also most certainly left out a few FBP libs that I failed to find in the short time of researching this topic, so feel free to do some more research on your own.

    A simple FBP flow

    For today’s code, I picked the first of the libraries above, trustmaster/goflow. It provides a quite readable syntax and comes with detailed documentation. (On the flipside, goflow uses quite some reflection inside, which some of you might frown upon.)

    Our sample code is an incarnation of the schematic FBP graph in the initial animation. Let’s turn the abstract nodes and data items into something more tangible. For example, we could feed the network with sentences and let one node count the words in each sentence and another count the letters. The final node then prints the results.

    The code

    First, we define the nodes. Each node is a struct with an embedded flow.Component and input and output channels (at least one of each kind, except for a sink node that only has input channels).

    Nodes can act on input by functions that are named after the input channels. For example, if an input channel is named “Strings”, the function that triggers on new input is called “OnStrings” by convention.

    We define these nodes:

    • A splitter that takes the input and copies it to two outputs.
    • A word counter that counts the words (i.e., non-whitespace content) of a sentence.
    • A letter counter that counts the letters (a-z and A-Z) of a sentence.
    • A printer that prints its input.

    None of these nodes knows about any of the other nodes, and does not need to.

    package main
    
    import (
    	"fmt"
    	"regexp"
    	"strings"
    
    	"github.com/trustmaster/goflow"
    )
    

    Our two counter nodes (see below) send their results asynchronously to the printer node. To distinguish between the outputs of the two counters, we attach a tag to each count. (Yes, sending just a string including the count would be easier but also more boring. The splitter already sends strings, so let’s try something different here.)

    type count struct {
    	tag   string
    	count int
    }
    

    The splitter receives strings and copies each one to its two output ports.

    type splitter struct {
    	flow.Component
    
    	In         <-chan string
    	Out1, Out2 chan<- string
    }
    

    OnIn dispatches the input string to the two output ports.

    func (t *splitter) OnIn(s string) {
    	t.Out1 <- s
    	t.Out2 <- s
    }
    

    WordCounter is a goflow component that counts the words in a string.

    type wordCounter struct {
    

    Embed flow functionality.

    	flow.Component
    

    The input port receives strings that (should) contain words.

    	Sentence <-chan string
    

    The output port sends the word count as integers.

    	Count chan<- *count
    }
    

    OnSentence triggers on new input from the Sentence port. It counts the number of words in the sentence.

    func (wc *wordCounter) OnSentence(sentence string) {
    	wc.Count <- &count{"Words", len(strings.Split(sentence, " "))}
    }
    

    letterCounter is a goflow component that counts the letters in a string.

    type letterCounter struct {
    	flow.Component
    	Sentence <-chan string
    

    The output port sends the letter count as integers.

    	Count chan<- *count
    

    To identify letters, we use a simple regular expression.

    	re *regexp.Regexp
    }
    

    OnSentence triggers on new input from the Sentence port. It counts the number of letters in the sentence.

    func (lc *letterCounter) OnSentence(sentence string) {
    	lc.Count <- &count{"Letters", len(lc.re.FindAllString(sentence, -1))}
    }
    

    An Init method allows a component to be initialized. Here we use it to run the expensive MustCompile method once, rather than every time OnSentence is called.

    func (lc *letterCounter) Init() {
    	lc.re = regexp.MustCompile("[a-zA-Z]")
    }
    

    A printer is a “sink” with no output channel. It prints the input to the console.

    type printer struct {
    	flow.Component
    	Line <-chan *count // inport
    }
    

    OnLine prints a count.

    func (p *printer) OnLine(c *count) {
    	fmt.Println(c.tag+":", c.count)
    }
    

    CounterNet represents the complete network of nodes and data pipelines.

    type counterNet struct {
    	flow.Graph
    }
    

    Assembling the network

    With the nodes in place, we can go forward and create the complete network, adding and connecting all the nodes.

    Construct the network graph.

    func NewCounterNet() *counterNet {
    	n := &counterNet{}
    

    Initialize the net.

    	n.InitGraphState()
    

    Add nodes to the net. (I deviated from the documentation by using &{} instead of new.) Each node gets a name assigned that is used later when connecting the nodes.

    	n.Add(&splitter{}, "splitter")
    	n.Add(&wordCounter{}, "wordCounter")
    	n.Add(&letterCounter{}, "letterCounter")
    	n.Add(&printer{}, "printer")
    

    Connect the nodes. The parameters are: Sending node, sending port, receiving node, and receiving port.

    	n.Connect("splitter", "Out1", "wordCounter", "Sentence")
    	n.Connect("splitter", "Out2", "letterCounter", "Sentence")
    	n.Connect("wordCounter", "Count", "printer", "Line")
    	n.Connect("letterCounter", "Count", "printer", "Line")
    

    Our net has 1 input port mapped to splitter.In.

    	n.MapInPort("In", "splitter", "In")
    	return n
    }
    

    Launching the network

    Finally, we only need to activate the network, create an input port, and start feeding it with selected bits of wisdom.

    func main() {
    

    Create the network.

    	net := NewCounterNet()
    

    We create a channel as the input port of the network.

    	in := make(chan string)
    	net.SetInPort("In", in)
    

    Start the net.

    	flow.RunNet(net)
    

    Now we can send some text and see what happens. This is as easy as sending text to the input channel. (All aphorisms by Oscar Wilde.)

    	in <- "I never put off till tomorrow what I can do the day after."
    	in <- "Fashion is a form of ugliness so intolerable that we have to alter it every six months."
    	in <- "Life is too important to be taken seriously."
    

    Closing the input channel shuts the network down.

    	close(in)
    

    Wait until the network has shut down.

    	<-net.Wait()
    }
    

    How to get and run the code

    Step 1: go get the code. Note the -d flag that prevents auto-installing the binary into $GOPATH/bin.

    go get -d github.com/appliedgo/flow
    

    Step 2: cd to the source code directory.

    cd $GOPATH/src/github.com/appliedgo/flow
    

    Step 3. Run the binary.

    go run ./flow
    

    The output should look like:

    Letters: 45
    Words: 17
    Words: 13
    Letters: 36
    Words: 8
    Letters: 70
    

    The unordered output shows that the nodes are indeed running asynchronously. Homework assignment: Add more info to the count struct to allow the printer node to group the output by input sentence.

    Conclusions

    Still, although we were able to nicely describe our nodes and the connections between them, the resulting code is far from representing an intuitive view on the flow of data within the program. This should not be surprising. A textual representation rarely matches up with the intuitiveness of a graphic representation.

    So where is the visual flow diagram editor, you ask?

    There are indeed some options.

    Shenzhen Go

    Just recently, an experimental visual Go environment has been presented to the public - Shenzhen Go. (Careful though - “experimental” means exactly this.)

    Shenzhen Word Counter

    Some nodes contain configurable standard actions, others contain Go code that reads from input channels and writes to output channels (unless the node is a sink).

    Shenzhen Print summary node

    go-fbp and DrawFBP

    If you want a graphic editor now and don’t want to wait until Shenzhen Go is production ready, have a look at themalkolm/go-fbp. This project generates Go code from the output of a graphical FBP editor called DrawFBP (a Java app). (Disclaimer: I have tested neither go-fbp nor DrawFBP.)

    Wikipedia: Flow-based programming

    John Paul Morrison: FBP flowbased.org

    Gopheracademy: Patterns for composable concurrent pipelines in Go

    Gopheracademy: Composable Pipelines Improved

    Happy coding!


          c't uplink 17.1: Google I/O, AI makes pictures prettier, Intel Optane
    The mobile team has plenty to do this week. At the Google I/O developer conference there are dozens of workshops, presentations and announcements around Android and many other Google products. While the Android O beta and Google's VR plans don't exactly blow us away, this year we find the news on artificial intelligence particularly exciting. Speaking of artificial intelligence: c't editor Johannes Merkert used the AI framework TensorFlow to train a neural network to upscale images more attractively. Luckily he didn't take the code to Adobe, but wrote a c't article about it instead. He discusses with us how images can already be improved with AI today and what else is coming our way. Finally, storage expert Lutz Labs explains what the designated flash successor 3D XPoint and Intel Optane are all about; he had one of the first Optane modules for testing in the c't lab. Featuring: Jörg Wirtgen, Johannes Merkert, Achim Barczok and Lutz Labs. c't 11/17 is available at newsstands, in the heise shop and digitally in the c't app for iOS and Android. All earlier episodes of our podcast can be found at www.ct.de/uplink.
          Defining complex custom loss functions in Keras
    Keras is a building-block style deep learning framework that makes it convenient and intuitive to assemble common deep learning models. Even before TensorFlow appeared, Keras was already just about the most popular deep learning framework of its day, with Theano as its backend, and...
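    The post is cut off here. As a rough illustration of the topic it introduces, here is a minimal sketch of one common way to define a custom loss in Keras. The weighting scheme, the pos_weight parameter and the toy model are my own assumptions, not taken from the post:

    import keras.backend as K
    from keras.models import Sequential
    from keras.layers import Dense

    # A custom loss in Keras is just a function of (y_true, y_pred) built from
    # backend ops; wrapping it in an outer function lets you pass extra parameters.
    def weighted_mse(pos_weight=2.0):          # pos_weight is a hypothetical knob
        def loss(y_true, y_pred):
            # weight errors on positive targets more heavily than on negatives
            w = 1.0 + (pos_weight - 1.0) * y_true
            return K.mean(w * K.square(y_pred - y_true), axis=-1)
        return loss

    model = Sequential([Dense(16, activation='relu', input_dim=8),
                        Dense(1, activation='sigmoid')])
    model.compile(optimizer='adam', loss=weighted_mse(pos_weight=3.0))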
              Gpu Benchmark: GTX 1080 vs. Titan X        

    GTX 1080 vs Titan X

    For professional gamers, the names GTX 1080 and Titan X stand for the current top of the range among graphics cards (GPUs) and are therefore the goal of every self-respecting player. Both cards are made by Nvidia and are everything you could want from a graphics card: high performance and a high-end hardware configuration.

    Technical specifications

    As mentioned, the hardware configuration of these cards is top-level and, as we will see further on, even though the differences between the two GPUs are not large, only one of them came out clearly ahead after all the tests it was put through.

                         GTX 1080       Titan X
    CUDA cores           2560           3584
    Base clock           1.6 GHz        1.42 GHz
    Boost clock          1.73 GHz       1.53 GHz
    Memory               8 GB DDR5X     12 GB DDR5X
    Memory bandwidth     320 GB/s       480 GB/s
    TDP                  180 W          250 W
    Processor            GP104          GP102
    Transistors          7.2bn          12bn

    Price

    The prices of these graphics cards reflect the performance they can reach, and current prices are very high. We are talking about roughly 1,200 euros for the Titan X, while the GTX 1080 costs about 800 euros.

    Amazon deals

    We have selected two interesting Amazon offers for these two little gems of technology. If you are a video game fan and only play at maximum resolution and maximum smoothness, considering the purchase of one of these cards could be the perfect gift to put under the tree this Christmas:

    Benchmark

    Now we come to the heart of this article, namely the results these two cards recorded in the various GPU benchmarks. The buzz around the test results had already started this summer, because until then the only known figures for the GTX 1080 and the Titan X were lab numbers released by Nvidia itself. The numbers were interesting, but anyone who follows this news knows that measured results usually differ from those declared by the manufacturer, so as soon as it was possible to get hold of these two Nvidia cards, several hardware websites compared them with the classic 3DMark test. The 3DMark results for the two cards can be consulted at this URL, and the ranking is reported below. 3D Mark: Titan X vs GTX 1080

    This kind of comparison is perfectly fine for deciding which card performs better for gaming, but not everyone knows that top-performing graphics cards such as the Titan X and the GTX 1080 also have applications in newer disciplines like machine learning or, more specifically, deep learning. As it happens, while searching the web for news on the subject we discovered that an Italian company working in these fields, Add-For of Turin, recently published its own study on the performance of these two cards applied to deep learning algorithms and tested with several libraries such as TensorFlow, Caffe and Neon. The results obtained in these deep learning benchmarks were slightly different, crowning the Titan X as the undisputed queen of graphics cards. These are certainly two cards with very high capability, able to delight any gamer. If, beyond gaming, you are also interested in more professional applications such as deep learning, then the Titan X is the best choice! Add-For will soon release new benchmarks of graphics cards such as the Nvidia Tesla K40 and K80, which will be installed in high-performance HPC systems.
              é–¢è¥¿ãƒ¢ãƒã‚¤ãƒ«ã‚¢ãƒ—リ研究会 #関モバ 第21回 に参加        

    I attended the 21st meeting of the Kansai Mobile App Study Group (known as Kanmoba).

    kanmoba.connpass.com

    Venue

    The venue this time was once again the Kyoto office of Hatena Inc., who also help me out with this blog and much more.
    I really am indebted to them.

    What I presented

    speakerdeck.com
    Supplementary material

    With only about five days to go until the leap second insertion, I put together a leap second special.
    The previous leap second was on July 1, 2015 (Japan time), and just before it I gave a similar talk at an internal study session at work. This presentation is that material rearranged for Kanmoba.

    Talks I heard

    I couldn't take notes on everything, but the content was roughly as follows.

    The push notification testing tool introduced at the end looked handy.
    GitHub - noodlewerk/NWPusher: OS X and iOS application and framework to play with the Apple Push Notification service (APNs)


    Bonus 1


    I'm starting to suspect this is caused by the firewall I have running on my own machine.

    Bonus 2

    At the after-party I talked about dates the whole time.
    The Public Holiday Law (the Act on National Holidays) is endlessly fascinating.
    I spoke as if the "Vernal Equinox Day" and "Autumnal Equinox Day" that appear in Article 2 were decided and announced by the Ministry of Internal Affairs and Communications, but on checking, that is not accurate: according to an explanatory page by the Cabinet Office, the correct practice is that the National Astronomical Observatory of Japan publishes them in the official gazette. I would like to correct that here.

    In closing

    Thank you to the organizers, to everyone at Hatena, and to everyone who joined me at the venue!


          CES 2016 Preview / CES through the keyword C.A.R.


    <People driving a car with VR headsets at CES 2015 / Photo: Jaekwon Son>


    "올해는 노스홀(자동차관)부터 가라"

    키워드 C.A.R로 본 CES2016


    손재권 매일경제 기자
     
    2016 looks set to be the year that signals the start of a seismic change in industrial and economic structures.

    It will not happen all at once. But change has been accumulating under the influence of smartphones, which spread in earnest worldwide from 2010, and from 2016 the change built up beneath the surface will begin to stir and, over the next five years, start to transform physical space (the physical world, brick and mortar) and everyday life.

    In advanced markets such as the US, Europe and parts of Asia (Korea, Japan), smartphones are already saturated and 4G mobile networks are well established. Facebook, the flagship application of the smartphone era, has passed 1.4 billion monthly active users, 1.1 billion of whom reach Facebook from mobile at least once a month.

    The fundamental changes that technological progress triggers are simultaneous and global.

    In Silicon Valley, audacious plans are floated, such as "we will build a colony on Mars" or "a hyperloop system will cover New York to Seoul in 4-5 hours." Yet because politics and society have not changed (if anything they are regressing), you can start to wonder whether this is the 20th, the 21st or the 22nd century. It is probably not an illusion.

    In this world, people still living in the 20th century coexist with people trying to realize the idea of traveling between Seoul and New York in 4-5 hours by hyperloop.

    People who live in the 21st century with a 20th-century mindset coexist with entrepreneurs who break that mindset and push toward the 22nd century, toward the future.

    CES, the event that announces the new year every January, opens in Las Vegas from the 5th (local time) to the 8th. It used to be called the Consumer Electronics Show, but from this year it is simply CES, and the Consumer Electronics Association (CEA) that hosts it has become the Consumer Technology Association (CTA).

    CES itself, once a show for home appliance and PC makers, has turned into a proving ground for new technologies such as the Internet of Things, smart cars, drones, robots and 3D printers, and it is shifting from a stage where global companies show off prototypes and technology into one where they exhibit products that are about to reach the market.

    Mobile devices, too, are drifting out of the spotlight. As recently as last year (2015) some companies unveiled new smartphones at CES (the LG G Flex, for example) or tablets (Chinese makers), but now they would hardly draw attention even if they appeared. The mood is "Mobile? So what?"

    TV used to be the star of CES. 3D TVs, OLED, bendable and shape-shifting TVs once decorated the first lines of press releases with "world's first, world's largest" labels, but here too the mood has changed to "So, would you actually buy one?" (That even TV has fallen out of favor at CES owes much to the collapse of 3D TV.)

    So what about CES 2016? This year extends the big structural change (embracing startups and new technologies with a separate exhibition space at the Sands Expo, named Tech East), but it also signals that the change has settled in and that these new technologies will soon be commercialized. In particular, the self-driving electric car has risen from supporting actor to the undisputed star of CES.

    When touring CES now, you should start not from the Central Hall of the Las Vegas Convention Center (LVCC), where Samsung Electronics, LG Electronics, Sony, Toshiba, TCL, Huawei and Intel exhibit, but from the North Hall, where Ford, GM, Audi, BMW, Mercedes-Benz and Kia are on display. The car will lead the flow of advanced information technology, and self-driving and electric vehicles will settle into our daily lives faster than expected (an estimated 10 million units worldwide by 2020).

    In that sense I chose "C.A.R" as the keyword for this CES 2016 preview: a combination of C (Car / China), A (AI, the algorithm business) and R (VR, AR, 4K content, HDR TV).
     


    The first keyword of CES 2016 is C, for Car. There is even a saying that the "C" in CES now stands not for Consumer Electronics but for Car.

    Indeed, at CES 2016 all ten of the major automakers, including Hyundai-Kia, Audi, Mercedes-Benz, BMW and Ford, have set up exhibits, and Herbert Diess, CEO of Volkswagen's passenger car division, and Mary Barra, CEO of General Motors (GM), will give keynotes.

    The exhibition area (18,581 m²) is up 25% from last year, and counting parts makers, some 150 automotive-related companies have booths.

    Until last year, smart car exhibits mostly meant infotainment and vehicle electronics; from this year it is different. The self-driving electric car, once championed by only a few, has become the mainstream as most of the global top 10 automakers have jumped in. Considering that Google first showed its self-driving car in 2012, that is astonishing progress.

    Google and Ford are expected to announce a self-driving car business partnership at CES 2016. Ford has so far concentrated on in-car electronics that let drivers run their apps in the vehicle, but this year it plans to put the self-driving car front and center, hand in hand with Google. Volkswagen is expected to unveil a project to turn its best-selling Microbus into an electric vehicle.

    Korea's Hyundai-Kia will also make its global debut, showing electric versions of the Santa Fe and the Soul as self-driving cars. Kia holds a press conference on the 5th, which brings the number of Korean companies holding press events at CES to three.

    It is encouraging that the Hyundai-Kia Motor Group (Hyundai Motor, Kia Motors, Hyundai Mobis) has set up exhibits and held a press conference at CES 2016, throwing itself into the race for the future car platform. By debuting at CES 2016, it has joined the ranks of those who can build self-driving electric cars.

    At present only four countries - the US, Germany, Japan and Korea - can design and produce self-driving electric cars. Even adding Italy, France and China, which will join later, only seven countries can manufacture future cars (electric, self-driving, hydrogen fuel-cell and so on).

    Future cars will decisively affect not only industrial structure but the way we live. They are not only eco-friendly (zero carbon emissions); they can change the transportation system itself. Taxis could disappear and public transit could expand.

    Korea is one of the few countries that can aim for new revenue and new jobs by growing the self-driving electric car ecosystem. Across the whole ecosystem there are globally competitive companies: platforms (Samsung Electronics), components (Samsung, LG Innotek), batteries (Samsung SDI, LG Chem), vehicle electronics (Hyundai Mobis, LG Electronics, Samsung Electronics) and telecom (SKT, KT, LG U+). Having an ecosystem of this caliber is itself a sign that the future car has handed Korea a rare stroke of luck. In this field we must compete and cooperate with the most advanced countries in technology, markets and talent alike.

    What Korea lacks are the areas that demand top-tier talent, such as systems engineering and design, and the government (all manner of regulation), and neither can be fixed in the short term.

    So bold policy is needed if self-driving electric cars are to create new jobs. The government should make not one particular region (Daegu) but, for the first time in the world, the whole country (gradually) a regulation-free zone for autonomous driving, and it should roll out electric vehicle infrastructure quickly and without second-guessing. Universities and research institutes should be reorganized to train the people needed to commercialize self-driving electric cars and to build up engineering and design capability. If we must "select and focus" even among new technology fields, the autonomous electric car should be pick number one.

    Tesla's self-driving technology (YouTube)


    The second C of the first keyword is China. Some even say the C in CES stands for China, and it is no exaggeration: of the 3,897 companies at last year's CES, Chinese companies accounted for about 30%, or 1,154. They were second only to US companies (47%), and this year looks no different. Among the Chinese companies, 471 were based in Shenzhen - more than the exhibitors from Korea, France, Canada, Japan, the UK and Germany combined.

    At CES 2016, companies such as Huawei, TCL, Changhong, Lenovo and Hisense will be seen in the Central Hall challenging the stronghold of the Korean and Japanese electronics makers. Chinese companies often used to stop at imitating Korean technology; it will be worth watching how much technical progress they show this year.

    Since Samsung Electronics has, since last year, focused its exhibits on platforms such as the Internet of Things (IoT) rather than TV technology, "Chinese companies chasing Samsung" no longer looks like much of a story.

    But the situation changes in the South Hall, which is practically a China hall. A large share of the Shenzhen companies are located there; drone maker DJI is the prime example, and DJI in effect represents the drone companies of CES as a whole.

    Faraday Future teaser video (YouTube)

    On top of this, a company called Faraday Future will unveil its first concept car on the 4th under the banner of "beat Tesla." The global auto industry is watching its moves closely, and it is already promising to stir things up.

    The company was founded in the US by Jia Yueting, who made his fortune in China with Le TV. It has already invested more than roughly 1 trillion won and is building a factory in Nevada. Starting with its very name, which invokes Faraday, the discoverer of electromagnetic induction, it has openly declared itself a challenger to Tesla. It is known to have recruited a good number of Tesla researchers, and its chief designer is a Korean-American named Richard Kim.

    The emergence of Faraday Future looks like the opening signal for "Chinese companies 2.0." Existing Chinese companies grew into global players on the strength of a huge domestic market and full government backing; beyond China's oil and telecom firms, IT companies such as Huawei, Lenovo, Alibaba and Tencent are no exception. Xiaomi sits somewhere around 1.5, between 1.0 and 2.0: it aims at a global platform and actively recruits overseas talent (Hugo Barra, for one), but its base is still Chinese domestic demand.

    Now, however, a polished company has appeared that was funded with Chinese capital, founded in the US and aimed at the global market from the start, and that company is Faraday Future. Since nothing has been unveiled yet we cannot gauge its disruptive power in detail, but its underlying strength looks far from ordinary.
     
    The second keyword of CES 2016 is A: artificial intelligence (AI) and the algorithm business. AI is, needless to say, a core trend for the future. The field has moved beyond technology development to the commercialization stage, and competition is already under way to lock in the platform and the market.

    Google's announcement last November that it would open-source TensorFlow, the core AI engine behind Google Photos and other services, and the launch last December of OpenAI, a non-profit AI research foundation backed by Elon Musk, Peter Thiel and others, are both moves aimed at that market. IBM Watson, the front-runner in AI, is also working with SoftBank. The AI business looks set to grow sharply within a year or two.

    Not many companies at this CES deal with AI itself. But the next-generation drones and robots unveiled at CES 2016 will ship with AI software built in, so they deserve attention as AI's killer applications. Since last year CES has followed a "drones in the sky, robots on the ground" pattern, and it repeats this year.

    The CES organizers have set up a separate Unmanned Systems exhibit and say they have grown its floor space (25,000 m²) by 200% over last year. DJI, called the Apple of drones, has a large booth, and action camera maker GoPro plans to unveil its first drone (Karma).

    The reason to watch drones at CES 2016 is that where they used to amount to "neat, it flies," we can now expect not only drones in many shapes but products that have made big technical leaps.

    One British company plans to show a fuel-cell drone that can fly for more than an hour, and chipmakers such as Intel and Qualcomm will not only show drones of their own but also announce chips for drones and plans to grow the industry.

    A term to watch in the drone industry is FPV, or First Person View: put a camera on a drone and you can see the world from a first-person point of view. Combine that with a virtual reality headset and you truly feel the shift in perspective. It foreshadows a dramatic change in the grammar of media, from a third-person world to a first-person one.

    Many robots are expected on the show floor again this year.

    Robots are a hot item. Some may dismiss them as a story for the distant future, but look at the actual robot industry and you realize we already live in a robot world. At CES 2016, robot-related exhibits are up 71% from last year.

    Jibo, a US social robot startup, raised $3.7 million on the crowdfunding site Indiegogo alone and is drawing attention by exhibiting at this CES as well. French, Japanese and German companies will also be showing robots competitively; it will be interesting to see how far the technology has come since last year.

    The other A keyword is the algorithm business, a field the market research firm Gartner has picked as a key keyword for the years ahead. This CES also introduces new businesses such as "beauty tech," which combines beauty and technology to create new business; "baby tech," which helps parents raise children safely; and "sports tech," which analyzes baseball, soccer and other sports scientifically.
              ã€æŠ€æœ¯åˆ†äº«ã€‘三种特征向量对深度学习攻击检测的影响        
    【技术分享】三种特征向量对深度学习攻击检测的影响

    2017-08-08 17:41:19

    阅读:949次
    点赞(0)
    收藏
    来源: 安全客





    【技术分享】三种特征向量对深度学习攻击检测的影响

    作者:360天眼实验室





    【技术分享】三种特征向量对深度学习攻击检测的影响

    作者:manning@天眼实验室


    0x00 Introduction

    The combination of deep learning and network security is a major trend for the future of security. In this post we use mainstream deep-learning algorithms to detect SQL injection behavior, in order to show how three kinds of feature vectors affect the detection performance of deep learning models.


    0x01 A brief introduction to deep learning

    Deep learning is a branch of machine learning that attempts high-level abstraction of data using algorithms built from multiple processing layers with complex structures or multiple non-linear transformations. It is a representation-learning approach within machine learning; its appeal is that efficient unsupervised or semi-supervised feature learning and hierarchical feature extraction replace hand-crafted features. In our experiments we use the Python deep learning library TensorFlow. The models used are:

    Multilayer perceptron

    A multilayer perceptron (MLP) is a feed-forward artificial neural network that maps a set of input vectors to a set of output vectors. An MLP can be viewed as a directed graph composed of several layers of nodes, each layer fully connected to the next. Apart from the input nodes, every node is a neuron (processing unit) with a non-linear activation function. Detailed introduction

    Convolutional neural network

    A convolutional neural network (CNN) is a feed-forward neural network whose artificial neurons respond to surrounding units within part of their receptive field, which makes it excellent for large-scale image processing. A CNN consists of one or more convolutional layers topped by fully connected layers (as in a classical neural network), together with associated weights and pooling layers. This structure lets a CNN exploit the two-dimensional structure of the input data. Compared with other deep learning architectures, CNNs give better results in image and speech recognition, can be trained with backpropagation, and need fewer parameters than other deep feed-forward networks, which makes them an attractive architecture. Detailed introduction

    Recurrent neural network

    Recursive neural network (RNN) is an umbrella term for two kinds of artificial neural networks: the time-recurrent neural network (recurrent neural network) and the structurally recursive neural network (recursive neural network). In a recurrent network the connections between neurons form a directed graph over time, while a recursive network builds a deeper network by applying a similar structure recursively. "RNN" usually refers to the time-recurrent kind. A plain recurrent network cannot cope with weights that explode or vanish exponentially as the recursion unfolds (the vanishing gradient problem) and has difficulty capturing long-range temporal dependencies; combining it with an LSTM solves this well. Detailed introduction

    Network structures used in the experiments

    Multilayer perceptron

    The network structure is:

    • Input layer
    • Hidden layer L1
    • Hidden layer L2
    • Hidden layer L3
    • Output layer

    Each hidden layer uses 128 neurons with the relu activation function (a minimal sketch follows below).


    (Figure omitted: the network structure graph as rendered by TensorBoard.)
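    As a rough illustration of the MLP described above: the article builds its models directly in TensorFlow, but a minimal sketch using the Keras API (my substitution, for brevity) could look like this. Only the three 128-unit relu hidden layers come from the text; the 250-dimensional input and the two-class softmax output are assumptions:

    from keras.models import Sequential
    from keras.layers import Dense

    # Three hidden layers of 128 relu units, as listed above.
    model = Sequential([
        Dense(128, activation='relu', input_dim=250),  # hidden layer L1
        Dense(128, activation='relu'),                 # hidden layer L2
        Dense(128, activation='relu'),                 # hidden layer L3
        Dense(2, activation='softmax'),                # output layer
    ])
    model.compile(optimizer='adam', loss='categorical_crossentropy',
                  metrics=['accuracy'])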

    Network structures used in the experiments

    Convolutional neural network

    The network structure is (a sketch follows the list):

    • Input layer
    • Convolutional layer
    • Pooling layer
    • Convolutional layer
    • Pooling layer
    • Fully connected layer
    • Output layer
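    Again purely as an illustration (the filter counts, kernel sizes and the treatment of the feature vector as a one-channel sequence of length 250 are my guesses; the article only names the layer types), a Keras-style sketch of that stack:

    from keras.models import Sequential
    from keras.layers import Conv1D, MaxPooling1D, Flatten, Dense

    # conv / pool / conv / pool / fully connected / output, as listed above.
    model = Sequential([
        Conv1D(32, 5, activation='relu', input_shape=(250, 1)),
        MaxPooling1D(pool_size=2),
        Conv1D(64, 5, activation='relu'),
        MaxPooling1D(pool_size=2),
        Flatten(),
        Dense(128, activation='relu'),
        Dense(2, activation='softmax'),
    ])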

    Recurrent neural network

    The network structure is (a sketch follows the list):

    • Input layer
    • Forward layer
    • Backward layer
    • Output layer
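    The forward and backward layers describe a bidirectional recurrent network. A Keras-style sketch under assumed dimensions (the sequence length, token dimension and LSTM width are mine; the article does not give them):

    from keras.models import Sequential
    from keras.layers import Bidirectional, LSTM, Dense

    # Forward + backward recurrent layers over an assumed (50, 250) input,
    # followed by a two-class output layer.
    model = Sequential([
        Bidirectional(LSTM(64), input_shape=(50, 250)),
        Dense(2, activation='softmax'),
    ])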


    PS: The training and test sets come from the 360 Enterprise Security SkyEye big data platform, and the data is of good purity.


    0x02 The feature vectors

    We converted features in three ways, which are currently among the better choices for handling strings:

    • word2vec-based feature vectors
    • bag-of-words feature vectors
    • FOFE-based feature vectors

    word2vec-based feature vectors

    word2vec turns each word into a multi-dimensional feature vector according to the trained model; to build a feature for a whole query we take the brute-force approach of simply adding the word vectors together (see the sketch below).

    In natural language experiments word2vec represents the relationships between words very well; see, for example, explorations of word similarity on a Wikipedia corpus.
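    A rough sketch of that brute-force summation. The use of gensim, the toy tokenized queries and the 100-dimensional vectors are my assumptions; the article does not say which word2vec implementation or settings it used:

    import numpy as np
    from gensim.models import Word2Vec

    # Train word2vec on tokenized queries (toy data here), then represent a
    # whole query as the plain sum of its word vectors.
    sentences = [["select", "*", "from", "users", "where", "id", "=", "1"],
                 ["select", "name", "from", "users"]]
    w2v = Word2Vec(sentences, size=100, min_count=1)

    def sentence_vector(tokens, model, dim=100):
        vec = np.zeros(dim)
        for t in tokens:
            if t in model.wv:
                vec += model.wv[t]   # brute-force vector addition
        return vec

    features = sentence_vector(["select", "*", "from", "users"], w2v)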

    Bag-of-words feature vectors

    For the bag-of-words vector we picked, on the SkyEye Lab attack platform, the 250 words that appear most often in SQL injection and used them to build the bag-of-words model (see the sketch below).

    For background on the bag-of-words model, see a detailed introduction to BoW.
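    A minimal sketch of a fixed-vocabulary bag-of-words encoding. The tokens below are placeholders; the actual 250-word vocabulary from the SkyEye platform is not published in the article:

    import numpy as np

    # Hypothetical fragment of the 250-token vocabulary of common SQL-injection words.
    VOCAB = ["select", "union", "from", "where", "or", "and", "sleep", "--"]
    INDEX = {w: i for i, w in enumerate(VOCAB)}

    def bow_vector(tokens):
        vec = np.zeros(len(VOCAB))
        for t in tokens:
            i = INDEX.get(t.lower())
            if i is not None:
                vec[i] += 1          # plain term counts; no ordering information
        return vec

    print(bow_vector(["1'", "or", "sleep", "(", "5", ")", "--"]))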

    FOFE-based feature vectors

    FOFE is a simple and elegant rule-based encoding. Put plainly, it builds on one-hot encoding and uses the magnitude of the values to carry information about word position. We added the FOFE algorithm on top of the bag-of-words model above (see the sketch below).

    The FOFE paper is from Prof. Hui Jiang's group:

    The Fixed-Size Ordinally-Forgetting Encoding Method for Neural Network Language Models
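    A sketch of the FOFE idea on top of the same toy vocabulary. The update is z_t = alpha * z_(t-1) + one_hot(w_t); the forgetting factor value below is my choice, since the paper leaves it as a tunable constant in (0, 1):

    import numpy as np

    VOCAB = ["select", "union", "from", "where", "or", "and", "sleep", "--"]
    INDEX = {w: i for i, w in enumerate(VOCAB)}

    def fofe_vector(tokens, alpha=0.7):
        # z_t = alpha * z_(t-1) + one_hot(token_t): earlier words decay, so the
        # magnitudes encode position on top of the bag-of-words counts.
        z = np.zeros(len(VOCAB))
        for t in tokens:
            z *= alpha
            i = INDEX.get(t.lower())
            if i is not None:
                z[i] += 1.0
        return z

    print(fofe_vector(["select", "from", "where", "or"]))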


    0x03 Analysis of the experimental results

    Our training data consists of 50,000 samples and the test data of 500,000 samples.

    (Figure omitted.)

    All three kinds of vectors show very good accuracy.

    (Figure omitted.)

    As the chart (omitted here) showed, the FOFE-based feature vector and the bag-of-words feature vector do not differ much in performance: folding in positional information did not bring the FOFE vector a clear improvement in detection. The word2vec vectors did not do very well on the real-world set, because we built query vectors by crude vector addition, which fails to capture what word2vec can express about a sentence.

    (Figure omitted.)

    As the chart (omitted here) showed, classification with the word2vec-based feature vector is clearly slower than with the other two methods. The bag-of-words vector is slightly faster than the FOFE vector, essentially because the FOFE algorithm adds some computation, so the slowdown is expected.


    0x04 Summary

    In my view, this cross-experiment with three ways of building vectors and three neural network architectures, exploring how vector form and network structure interact, is only a first step meant to invite discussion. The most surprising result was that the combination of CNN and word2vec performed best on the real-world set. The FOFE feature vector carries a notion of order, but it failed to deliver better detection results than the plain bag-of-words model.

    Deep neural networks can take security detection toward the ability to catch "unknown unknowns", and that is a direction we have to keep investing in. The road must be walked one step at a time, and we will keep moving forward along it.


    0x05 References

    https://zh.wikipedia.org/wiki/%E6%B7%B1%E5%BA%A6%E5%AD%A6%E4%B9%A0

    https://yq.aliyun.com/articles/118686?spm=5176.100239.0.0.g2XnLx

    http://www.52nlp.cn/tag/word2vec

    http://blog.csdn.net/u010213393/article/details/40987945



    This article was originally published by Anquanke (安全客). Please credit the source and include the article URL when reposting.
    Article URL: http://bobao.360.cn/learning/detail/4224.html

              Senior AI Solution Architect - Innodata Labs - Remote        
    Our stack includes Python, TensorFlow, Hadoop, Messaging (RabbitMQ, Kafka), Docker / Kubernetes and cloud infrastructure (AWS, Google Cloud) as well as various...
    From Innodata Labs - Fri, 05 May 2017 23:23:37 GMT
          How do you program an artificial intelligence?

    How do you program an artificial intelligence?

    Reply to: how do you program an artificial intelligence

    Well, for artificial intelligence, as I understand it, specialized programming languages such as Prolog and Racket are used, but you can also use libraries for different programming languages that provide artificial intelligence tools, for example Google's TensorFlow library which, if I'm not mistaken, could help you perform some artificial intelligence tasks.

    Posted on June 21, 2017 by jonathan

          Exporting a neural network model for an app

    Exporting a neural network model for an app

    Hi,

    I'm getting started with neural networks and have been trying TensorFlow. After the typical tutorial on recognizing a handwritten letter, having seen the future possibilities of the subject, and knowing that deep learning for analyzing, say, objects in a video can be expensive in hardware terms, the following question came to me:


    Suppose I train a neural network to recognize faces in a real-time stream, say it marks them with a r...

    Posted on June 15, 2017 by david

              Supercharge your Computer Vision models with the TensorFlow Object Detection API        
              Machine Learning with Python Course and E-Book Bundle for $49        
    4 E-Books & 5 Courses to Help You Perform Machine Learning Analytics & Command High-Paying Jobs
    Expires January 22, 2022 23:59 PST
    Buy now and get 92% off

    Deep Learning with TensorFlow


    KEY FEATURES

    Deep learning is the intersection of statistics, artificial intelligence, and data to build accurate models, and is one of the most important new frontiers in technology. TensorFlow is one of the newest and most comprehensive libraries for implementing deep learning. Over this course you'll explore some of the possibilities of deep learning, and how to use TensorFlow to process data more effectively than ever.

    • Access 22 lectures & 2 hours of content 24/7
    • Discover the efficiency & simplicity of TensorFlow
    • Process & change how you look at data
    • Sift for hidden layers of abstraction using raw data
    • Train your machine to craft new features to make sense of deeper layers of data
    • Explore logistic regression, convolutional neural networks, recurrent neural networks, high level interfaces, & more

    PRODUCT SPECS

    Details & Requirements

    • Length of time users can access this course: lifetime
    • Access options: web streaming, mobile streaming
    • Certification of completion not included
    • Redemption deadline: redeem your code within 30 days of purchase
    • Experience level required: all levels

    Compatibility

    • Internet required

    THE EXPERT

    Dan Van Boxel is a Data Scientist and Machine Learning Engineer with over 10 years of experience. He is most well-known for "Dan Does Data," a YouTube livestream demonstrating the power and pitfalls of neural networks. He has developed and applied novel statistical models of machine learning to topics such as accounting for truck traffic on highways, travel time outlier detection, and other areas. Dan has also published research and presented findings at the Transportation Research Board and other academic journals.

    Beginning Python


    KEY FEATURES

    Python is the general purpose, multi-paradigm programming language that many professionals consider one of the best beginner languages due to its relative simplicity and applicability to many coding arenas. This course assumes no prior experience and helps you dive into Python fundamentals to come to grips with this popular language and start your coding odyssey off right.

    • Access 43 lectures & 4.5 hours of content 24/7
    • Learn variables, numbers, strings, & more essential components of Python
    • Make decisions on your programs w/ conditional statements
    • See how functions play a major role in providing a high degree of code recycling
    • Create modules in Python
    • Perform image manipulations w/ Python

    PRODUCT SPECS

    Details & Requirements

    • Length of time users can access this course: lifetime
    • Access options: web streaming, mobile streaming
    • Certification of completion not included
    • Redemption deadline: redeem your code within 30 days of purchase
    • Experience level required: all levels

    Compatibility

    • Internet required

    THE EXPERT

    William Fiset is a Mathematics and Computer Science Honors student at Mount Allison University with in interest in competitive programming. William has been a Python developer for +4 years, starting his early Python experience with game development. He owns a popular YouTube channel that teaches Python to beginners and the basics of game development.

    Deep Learning with Python


    KEY FEATURES

    You've seen deep learning everywhere, but you may not have realized it. This discipline is one of the leading solutions for image recognition, speech recognition, object recognition, and language translation - basically the tools you see Google roll out every day. Over this course, you'll use Python to expand your deep learning knowledge to cover backpropagation and its ability to train neural networks.

    • Access 19 lectures & 2 hours of content 24/7
    • Train neural networks in deep learning & to understand automatic differentiation
    • Cover convolutional & recurrent neural networks
    • Build up the theory that covers supervised learning
    • Integrate search & image recognition, & object processing
    • Examine the performance of the sentiment analysis model

    PRODUCT SPECS

    Details & Requirements

    • Length of time users can access this course: lifetime
    • Access options: web streaming, mobile streaming
    • Certification of completion not included
    • Redemption deadline: redeem your code within 30 days of purchase
    • Experience level required: all levels

    Compatibility

    • Internet required

    THE EXPERT

    Eder Santana is a PhD candidate in Electrical and Computer Engineering. His thesis topic is on Deep and Recurrent neural networks. After working for 3 years with Kernel Machines (SVMs, Information Theoretic Learning, and so on), Eder moved to the field of deep learning 2.5 years ago, when he started learning Theano, Caffe, and other machine learning frameworks. Now, Eder contributes to Keras: Deep Learning Library for Python. Besides deep learning, he also likes data visualization and teaching machine learning, either on online forums or as a teacher assistant.

    Data Mining with Python


    KEY FEATURES

    Every business wants to gain insights from data to make more informed decisions. Data mining provides a way of finding these insights, and Python is one of the most popular languages with which to perform it. In this course, you will discover the key concepts of data mining and learn how to apply different techniques to gain insight into real-world data. By course's end, you'll have a valuable skill that companies are clamoring to hire for.

    • Access 21 lectures & 2 hours of content 24/7
    • Discover data mining techniques & the Python libraries used for data mining
    • Tackle notorious data mining problems to get a concrete understanding of these techniques
    • Understand the process of cleaning data & the steps involved in filtering out noise
    • Build an intelligent application that makes predictions from data
    • Learn about classification & regression techniques like logistic regression, k-NN classifier, & more
    • Predict house prices & the number of TV show viewers

    PRODUCT SPECS

    Details & Requirements

    • Length of time users can access this course: lifetime
    • Access options: web streaming, mobile streaming
    • Certification of completion not included
    • Redemption deadline: redeem your code within 30 days of purchase
    • Experience level required: all levels

    Compatibility

    • Internet required

    THE EXPERT

    Saimadhu Polamuri is a data science educator and the founder of Data Aspirant, a Data Science portal for beginners. He has 3 years of experience in data mining and 5 years of experience in Python. He is also interested in big data technologies such as Hadoop, Pig, and Spark. He has a good command of the R programming language and Matlab, and a rudimentary understanding of the C++ computer vision library (OpenCV) and big data technologies.

    Data Visualization: Representing Information on the Modern Web E-Book


    KEY FEATURES

    You see graphs all over the internet, the workplace, and your life - but do you ever stop to consider how all that data has been visualized? There are many tools and programs that data scientists use to visualize massive, disorganized sets of data. This e-book contains content from "Data Visualization: A Successful Design Process" by Andy Kirk, "Social Data Visualization with HTML5 and JavaScript" by Simon Timms, and "Learning d3.js Data Visualization, Second Edition" by Andrew Rininsland and Swizec Teller, all professionally curated to give you an easy-to-follow track to master data visualization in your own work.

    • Harness the power of D3 by building interactive & real-time data-driven web visualizations
    • Find out how to use JavaScript to create compelling visualizations of social data
    • Identify the purpose of your visualization & your project’s parameters to determine overriding design considerations across your project’s execution
    • Apply critical thinking to visualization design & get intimate with your dataset to identify its potential visual characteristics
    • Explore the various features of HTML5 to design creative visualizations
    • Discover what data is available on Stack Overflow, Facebook, Twitter, & Google+
    • Gain a solid understanding of the common D3 development idioms

    PRODUCT SPECS

    Details & Requirements

    • Length of time users can access this course: lifetime
    • Access options: web streaming, mobile streaming
    • Certification of completion not included
    • Redemption deadline: redeem your code within 30 days of purchase
    • Experience level required: all levels

    Compatibility

    • Internet required

    THE EXPERT

    Packt’s mission is to help the world put software to work in new ways, through the delivery of effective learning and information services to IT professionals. Working towards that vision, it has published over 3,000 books and videos so far, providing IT professionals with the actionable knowledge they need to get the job done–whether that’s specific learning on an emerging technology or optimizing key skills in more established tools.

    Python: Master the Art of Design Patterns E-Book


    KEY FEATURES

    Get a complete introduction to the many uses of Python in this curated e-book drawing content from "Python 3 Object-Oriented Programming, Second Edition" by Dusty Phillips, "Learning Python Design Patterns, Second Edition" by Chetan Giridhar, and "Mastering Python Design Patterns" by Sakis Kasampalis. Once you've got your feet wet, you'll focus in on the most common and useful design patterns from a Python perspective. By course's end, you'll have a complex understanding of designing patterns with Python, allowing you to develop better coding practices and create systems architectures.

    • Discover what design patterns are & how to apply them to writing Python
    • Implement objects in Python by creating classes & defining methods
    • Separate related objects into a taxonomy of classes & describe the properties & behaviors of those objects via the class interface
    • Understand when to use object-oriented features & when not to use them
    • Explore the design principles that form the basis of software design, such as loose coupling, the Hollywood principle, & the Open Close principle, & more
    • Use Structural Design Patterns to find out how objects & classes interact to build larger applications
    • Improve the productivity & code base of your application using Python design patterns
    • Secure an interface using the Proxy pattern

    PRODUCT SPECS

    Details & Requirements

    • Length of time users can access this course: lifetime
    • Access options: web streaming, mobile streaming
    • Certification of completion not included
    • Redemption deadline: redeem your code within 30 days of purchase
    • Experience level required: all levels

    Compatibility

    • Internet required

    THE EXPERT

    Packt’s mission is to help the world put software to work in new ways, through the delivery of effective learning and information services to IT professionals. Working towards that vision, it has published over 3,000 books and videos so far, providing IT professionals with the actionable knowledge they need to get the job done–whether that’s specific learning on an emerging technology or optimizing key skills in more established tools.

    Python: Deeper Insights into Machine Learning E-Book


    KEY FEATURES

    Machine learning and predictive analytics are becoming one of the key strategies for unlocking growth in a challenging contemporary marketplace. Consequently, professionals who can run machine learning systems are in high demand and are commanding high salaries. This e-book will help you get a grip on advanced Python techniques to design machine learning systems.

    • Learn to write clean & elegant Python code that will optimize the strength of your algorithms
    • Uncover hidden patterns & structures in data w/ clustering
    • Improve accuracy & consistency of results using powerful feature engineering techniques
    • Gain practical & theoretical understanding of cutting-edge deep learning algorithms
    • Solve unique tasks by building models
    • Come to grips w/ the machine learning design process

    PRODUCT SPECS

    Details & Requirements

    • Length of time users can access this course: lifetime
    • Access options: web streaming, mobile streaming
    • Certification of completion not included
    • Redemption deadline: redeem your code within 30 days of purchase
    • Experience level required: all levels

    Compatibility

    • Internet required

    THE EXPERT

    Packt’s mission is to help the world put software to work in new ways, through the delivery of effective learning and information services to IT professionals. Working towards that vision, it has published over 3,000 books and videos so far, providing IT professionals with the actionable knowledge they need to get the job done–whether that’s specific learning on an emerging technology or optimizing key skills in more established tools.

    Python: Real-World Data Science E-Book


    KEY FEATURES

    Data science is one of the most in-demand fields today, and this e-book will guide you to becoming an efficient data science practitioner in Python. Once you've nailed down Python fundamentals, you'll learn how to perform data analysis with Python in an example-driven way. From there, you'll learn how to scale your knowledge to processing machine learning algorithms.

    • Implement objects in Python by creating classes & defining methods
    • Get acquainted w/ NumPy to use it w/ arrays & array-oriented computing in data analysis
    • Create effective visualizations for presenting your data using Matplotlib
    • Process & analyze data using the time series capabilities of pandas
    • Interact w/ different kind of database systems, such as file, disk format, Mongo, & Redis
    • Apply data mining concepts to real-world problems
    • Compute on big data, including real-time data from the Internet
    • Explore how to use different machine learning models to ask different questions of your data

    PRODUCT SPECS

    Details & Requirements

    • Length of time users can access this course: lifetime
    • Access options: web streaming, mobile streaming
    • Certification of completion not included
    • Redemption deadline: redeem your code within 30 days of purchase
    • Experience level required: all levels

    Compatibility

    • Internet required

    THE EXPERT

    Packt’s mission is to help the world put software to work in new ways, through the delivery of effective learning and information services to IT professionals. Working towards that vision, it has published over 3,000 books and videos so far, providing IT professionals with the actionable knowledge they need to get the job done–whether that’s specific learning on an emerging technology or optimizing key skills in more established tools.

    Mastering Python


    KEY FEATURES

    Python is one of the most popular programming languages today, enabling developers to write efficient, reusable code. Here, you'll add Python to your repertoire, learning to set up your development environment, master use of its syntax, and much more. You'll soon understand why engineers at startups like Dropbox rely on Python: it makes the process of creating and iterating upon apps a piece of cake.

    • Master Python w/ 3 hours of content
    • Build Python packages to efficiently create reusable code
    • Create tools & utility programs, & write code to automate software
    • Distribute computation tasks across multiple processors
    • Handle high I/O loads w/ asynchronous I/O for smoother performance
    • Utilize Python's metaprogramming & programmable syntax features
    • Implement unit testing to write better code, faster

    PRODUCT SPECS

    Details & Requirements

    • Length of time users can access this course: lifetime
    • Access options: web streaming, mobile streaming
    • Certification of completion not included
    • Redemption deadline: redeem your code within 30 days of purchase
    • Experience level required: all levels

    Compatibility

    • Internet required

    THE EXPERT

    Packt’s mission is to help the world put software to work in new ways, through the delivery of effective learning and information services to IT professionals. Working towards that vision, it has published over 3,000 books and videos so far, providing IT professionals with the actionable knowledge they need to get the job done –whether that’s specific learning on an emerging technology or optimizing key skills in more established tools.

              The Advanced Guide to Deep Learning and Artificial Intelligence Bundle for $42        
    This High-Intensity 14.5 Hour Bundle Will Help You Help Computers Address Some of Humanity's Biggest Problems
    Expires November 28, 2021 23:59 PST
    Buy now and get 91% off

    Deep Learning: Convolutional Neural Networks in Python


    KEY FEATURES

    In this course, intended to expand upon your knowledge of neural networks and deep learning, you'll harness these concepts for computer vision using convolutional neural networks. Going in-depth on the concept of convolution, you'll discover its wide range of applications, from generating image effects to modeling artificial organs.

    • Access 25 lectures & 3 hours of content 24/7
    • Explore the StreetView House Number (SVHN) dataset using convolutional neural networks (CNNs)
    • Build convolutional filters that can be applied to audio or imaging
    • Extend deep neural networks w/ just a few functions
    • Test CNNs written in both Theano & TensorFlow
    Note: we strongly recommend taking The Deep Learning & Artificial Intelligence Introductory Bundle before this course.
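
    As a rough illustration of the convolution operation these filters are built on, here is a minimal 2-D convolution written in plain NumPy. This sketch is mine rather than course material, and the image and kernel below are made up for the example.

    import numpy as np

    def conv2d(image, kernel):
        """Valid 2-D convolution of a single-channel image with a small kernel."""
        kh, kw = kernel.shape
        oh = image.shape[0] - kh + 1
        ow = image.shape[1] - kw + 1
        out = np.zeros((oh, ow))
        # Flip the kernel so this is a true convolution rather than a correlation.
        k = kernel[::-1, ::-1]
        for i in range(oh):
            for j in range(ow):
                out[i, j] = np.sum(image[i:i+kh, j:j+kw] * k)
        return out

    # Example: a 3x3 edge-detection kernel applied to a random 8x8 "image".
    img = np.random.rand(8, 8)
    edge = np.array([[-1.0, -1.0, -1.0],
                     [-1.0,  8.0, -1.0],
                     [-1.0, -1.0, -1.0]])
    print(conv2d(img, edge).shape)  # (6, 6)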

    PRODUCT SPECS

    Details & Requirements

    • Length of time users can access this course: lifetime
    • Access options: web streaming, mobile streaming
    • Certification of completion not included
    • Redemption deadline: redeem your code within 30 days of purchase
    • Experience level required: all levels, but you must have some knowledge of calculus, linear algebra, probability, Python, Numpy, and be able to write a feedforward neural network in Theano and TensorFlow.
    • All code for this course is available for download here, in the directory cnn_class

    Compatibility

    • Internet required

    THE EXPERT

    The Lazy Programmer is a data scientist, big data engineer, and full stack software engineer. For his master's thesis he worked on brain-computer interfaces using machine learning. These assist non-verbal and non-mobile persons to communicate with their family and caregivers.

    He has worked in online advertising and digital media as both a data scientist and big data engineer, and built various high-throughput web services around said data. He has created new big data pipelines using Hadoop/Pig/MapReduce, and created machine learning models to predict click-through rate, news feed recommender systems using linear regression, Bayesian Bandits, and collaborative filtering and validated the results using A/B testing.

    He has taught undergraduate and graduate students in data science, statistics, machine learning, algorithms, calculus, computer graphics, and physics for students attending universities such as Columbia University, NYU, Humber College, and The New School.

    Multiple businesses have benefitted from his web programming expertise. He does all the backend (server), frontend (HTML/JS/CSS), and operations/deployment work. Some of the technologies he has used are: Python, Ruby/Rails, PHP, Bootstrap, jQuery (Javascript), Backbone, and Angular. For storage/databases he has used MySQL, Postgres, Redis, MongoDB, and more.

    Unsupervised Deep Learning in Python


    KEY FEATURES

    In this course, you'll dig deep into deep learning, discussing principal components analysis and a popular nonlinear dimensionality reduction technique known as t-distributed stochastic neighbor embedding (t-SNE). From there you'll learn about a special type of unsupervised neural network called the autoencoder, understanding how to link many together to get a better performance out of deep neural networks.

    • Access 30 lectures & 3 hours of content 24/7
    • Discuss restricted Boltzmann machines (RBMs) & how to pretrain supervised deep neural networks
    • Learn about Gibbs sampling
    • Use PCA & t-SNE on features learned by autoencoders & RBMs
    • Understand the most modern deep learning developments
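
    As a small illustration of the dimensionality-reduction side of the course, here is principal components analysis written via the SVD in NumPy. This is my own sketch on synthetic data, not material from the course.

    import numpy as np

    def pca(X, n_components=2):
        """Project data onto its top principal components using an SVD."""
        X_centered = X - X.mean(axis=0)
        # Rows of Vt are the principal directions, ordered by singular value.
        U, S, Vt = np.linalg.svd(X_centered, full_matrices=False)
        components = Vt[:n_components]
        return X_centered @ components.T, components

    # Example: 200 samples of 10-dimensional synthetic data reduced to 2-D.
    X = np.random.randn(200, 10)
    Z, components = pca(X, n_components=2)
    print(Z.shape)  # (200, 2)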

    PRODUCT SPECS

    Details & Requirements

    • Length of time users can access this course: lifetime
    • Access options: web streaming, mobile streaming
    • Certification of completion not included
    • Redemption deadline: redeem your code within 30 days of purchase
    • Experience level required: intermediate, but you must have some knowledge of calculus, linear algebra, probability, Python, Numpy, and be able to write a feedforward neural network in Theano and TensorFlow.
    • All code for this course is available for download here, in the directory unsupervised_class2

    Compatibility

    • Internet required

    THE EXPERT

    The Lazy Programmer is a data scientist, big data engineer, and full stack software engineer. For his master's thesis he worked on brain-computer interfaces using machine learning. These assist non-verbal and non-mobile persons to communicate with their family and caregivers.

    He has worked in online advertising and digital media as both a data scientist and big data engineer, and built various high-throughput web services around said data. He has created new big data pipelines using Hadoop/Pig/MapReduce, and created machine learning models to predict click-through rate, news feed recommender systems using linear regression, Bayesian Bandits, and collaborative filtering and validated the results using A/B testing.

    He has taught undergraduate and graduate students in data science, statistics, machine learning, algorithms, calculus, computer graphics, and physics for students attending universities such as Columbia University, NYU, Humber College, and The New School.

    Multiple businesses have benefitted from his web programming expertise. He does all the backend (server), frontend (HTML/JS/CSS), and operations/deployment work. Some of the technologies he has used are: Python, Ruby/Rails, PHP, Bootstrap, jQuery (Javascript), Backbone, and Angular. For storage/databases he has used MySQL, Postgres, Redis, MongoDB, and more.

    Deep Learning: Recurrent Neural Networks in Python


    KEY FEATURES

    A recurrent neural network is a class of artificial neural network where connections form a directed cycle, using their internal memory to process arbitrary sequences of inputs. This makes them capable of tasks like handwriting and speech recognition. In this course, you'll explore this extremely expressive facet of deep learning and get up to speed on this revolutionary new advance.

    • Access 32 lectures & 4 hours of content 24/7
    • Get introduced to the Simple Recurrent Unit, also known as the Elman unit
    • Extend the XOR problem as a parity problem
    • Explore language modeling
    • Learn Word2Vec to create word vectors or word embeddings
    • Look at the long short-term memory unit (LSTM), & gated recurrent unit (GRU)
    • Apply what you learn to practical problems like learning a language model from Wikipedia data
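
    To get a feel for what the Simple Recurrent (Elman) unit computes, here is a minimal forward pass in NumPy. This is my own sketch with made-up weight shapes, not code from the course.

    import numpy as np

    def elman_forward(x_seq, Wx, Wh, Wo, h0):
        """Run a simple recurrent (Elman) unit over a sequence of input vectors."""
        h = h0
        outputs = []
        for x in x_seq:
            # The hidden state mixes the current input with the previous hidden state.
            h = np.tanh(Wx @ x + Wh @ h)
            outputs.append(Wo @ h)
        return np.array(outputs), h

    # Made-up sizes: 5 time steps, 3 input units, 4 hidden units, 2 outputs.
    rng = np.random.default_rng(0)
    x_seq = rng.normal(size=(5, 3))
    Wx, Wh, Wo = rng.normal(size=(4, 3)), rng.normal(size=(4, 4)), rng.normal(size=(2, 4))
    out, h_last = elman_forward(x_seq, Wx, Wh, Wo, np.zeros(4))
    print(out.shape)  # (5, 2)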

    PRODUCT SPECS

    Details & Requirements

    • Length of time users can access this course: lifetime
    • Access options: web streaming, mobile streaming
    • Certification of completion not included
    • Redemption deadline: redeem your code within 30 days of purchase
    • Experience level required: all levels, but you must have some knowledge of calculus, linear algebra, probability, Python, Numpy, and be able to write a feedforward neural network in Theano and TensorFlow.
    • All code for this course is available for download here, in the directory rnn_class

    Compatibility

    • Internet required

    THE EXPERT

    The Lazy Programmer is a data scientist, big data engineer, and full stack software engineer. For his master's thesis he worked on brain-computer interfaces using machine learning. These assist non-verbal and non-mobile persons to communicate with their family and caregivers.

    He has worked in online advertising and digital media as both a data scientist and big data engineer, and built various high-throughput web services around said data. He has created new big data pipelines using Hadoop/Pig/MapReduce, and created machine learning models to predict click-through rate, news feed recommender systems using linear regression, Bayesian Bandits, and collaborative filtering and validated the results using A/B testing.

    He has taught undergraduate and graduate students in data science, statistics, machine learning, algorithms, calculus, computer graphics, and physics for students attending universities such as Columbia University, NYU, Humber College, and The New School.

    Multiple businesses have benefitted from his web programming expertise. He does all the backend (server), frontend (HTML/JS/CSS), and operations/deployment work. Some of the technologies he has used are: Python, Ruby/Rails, PHP, Bootstrap, jQuery (Javascript), Backbone, and Angular. For storage/databases he has used MySQL, Postgres, Redis, MongoDB, and more.

    Natural Language Processing with Deep Learning in Python


    KEY FEATURES

    In this course you'll explore advanced natural language processing - the field of computer science and AI that concerns interactions between computer and human languages. Over the course you'll learn four new NLP architectures and explore classic NLP problems like parts-of-speech tagging and named entity recognition, and use recurrent neural networks to solve them. By course's end, you'll have a firm grasp on natural language processing and its many applications.

    • Access 40 lectures & 4.5 hours of content 24/7
    • Discover Word2Vec & how it maps words to a vector space
    • Explore GLoVe's use of matrix factorization & how it contributes to recommendation systems
    • Learn about recursive neural networks which will help solve the problem of negation in sentiment analysis
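
    As a tiny illustration of what mapping words to a vector space buys you, here is cosine similarity between word vectors. The embeddings below are toy values made up for the example, not vectors produced by Word2Vec or GLoVe.

    import numpy as np

    def cosine_similarity(u, v):
        """Cosine of the angle between two word vectors."""
        return np.dot(u, v) / (np.linalg.norm(u) * np.linalg.norm(v))

    # Toy embeddings: semantically close words get nearby vectors.
    embeddings = {
        "king":  np.array([0.80, 0.65, 0.10]),
        "queen": np.array([0.75, 0.70, 0.15]),
        "apple": np.array([0.10, 0.20, 0.90]),
    }
    print(cosine_similarity(embeddings["king"], embeddings["queen"]))  # high
    print(cosine_similarity(embeddings["king"], embeddings["apple"]))  # low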

    PRODUCT SPECS

    Details & Requirements

    • Length of time users can access this course: lifetime
    • Access options: web streaming, mobile streaming
    • Certification of completion not included
    • Redemption deadline: redeem your code within 30 days of purchase
    • Experience level required: advanced, but you must have some knowledge of calculus, linear algebra, probability, Python, Numpy, and be able to write a feedforward neural network in Theano and TensorFlow.
    • All code for this course is available for download here, in the directory nlp_class2

    Compatibility

    • Internet required

    THE EXPERT

    The Lazy Programmer is a data scientist, big data engineer, and full stack software engineer. For his master's thesis he worked on brain-computer interfaces using machine learning. These assist non-verbal and non-mobile persons to communicate with their family and caregivers.

    He has worked in online advertising and digital media as both a data scientist and big data engineer, and built various high-throughput web services around said data. He has created new big data pipelines using Hadoop/Pig/MapReduce, and created machine learning models to predict click-through rate, news feed recommender systems using linear regression, Bayesian Bandits, and collaborative filtering and validated the results using A/B testing.

    He has taught undergraduate and graduate students in data science, statistics, machine learning, algorithms, calculus, computer graphics, and physics for students attending universities such as Columbia University, NYU, Humber College, and The New School.

    Multiple businesses have benefitted from his web programming expertise. He does all the backend (server), frontend (HTML/JS/CSS), and operations/deployment work. Some of the technologies he has used are: Python, Ruby/Rails, PHP, Bootstrap, jQuery (Javascript), Backbone, and Angular. For storage/databases he has used MySQL, Postgres, Redis, MongoDB, and more.

              Practical Deep Learning in Theano and TensorFlow for $29        
    Build & Understand Neural Networks Using Two of the Most Popular Deep Learning Techniques
    Expires November 02, 2021 23:59 PST
    Buy now and get 75% off

    KEY FEATURES

    The applications of Deep Learning are many, and constantly growing, just like the neural networks that it supports. In this course, you'll delve into advanced concepts of Deep Learning, starting with the basics of TensorFlow and Theano, understanding how to build neural networks with these popular tools. Using these tools, you'll learn how to build and understand a neural network, knowing exactly how to visualize what is happening within a model as it learns.

    • Access 23 lectures & 3 hours of programming 24/7
    • Discover batch & stochastic gradient descent, two techniques that allow you to train on a small sample of data at each iteration, greatly speeding up training time
    • Discuss how momentum can carry you through local minima
    • Learn adaptive learning rate techniques like AdaGrad & RMSprop
    • Explore dropout regularization & other modern neural network techniques
    • Understand the variables & expressions of TensorFlow & Theano
    • Set up a GPU-instance on AWS & compare the speed of CPU vs GPU for training a deep neural network
    • Look at the MNIST dataset & compare against known benchmarks
    Like what you're learning? Try out The Advanced Guide to Deep Learning and Artificial Intelligence next.
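
    To make the update rules concrete, here is a minimal sketch, written by me rather than taken from the course, of SGD-with-momentum and RMSprop parameter updates applied to a toy quadratic loss.

    import numpy as np

    def sgd_momentum_step(w, grad, velocity, lr=0.01, mu=0.9):
        """One parameter update of stochastic gradient descent with momentum."""
        velocity = mu * velocity - lr * grad
        return w + velocity, velocity

    def rmsprop_step(w, grad, cache, lr=0.001, decay=0.99, eps=1e-8):
        """One RMSprop update: a per-parameter adaptive learning rate (shown for comparison)."""
        cache = decay * cache + (1 - decay) * grad ** 2
        return w - lr * grad / (np.sqrt(cache) + eps), cache

    # Toy quadratic loss 0.5 * ||w||^2, whose gradient is simply w.
    w = np.array([1.0, -2.0])
    v = np.zeros_like(w)
    for _ in range(100):
        w, v = sgd_momentum_step(w, grad=w, velocity=v)
    print(w)  # approaches the minimum at [0, 0]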

    PRODUCT SPECS

    Details & Requirements

    • Length of time users can access this course: lifetime
    • Access options: web streaming, mobile streaming
    • Certification of completion not included
    • Redemption deadline: redeem your code within 30 days of purchase
    • Experience level required: all levels, but you must have some knowledge of calculus, linear algebra, probability, Python, and Numpy
    • All code for this course is available for download here, in the directory ann_class2

    Compatibility

    • Internet required

    THE EXPERT

    The Lazy Programmer is a data scientist, big data engineer, and full stack software engineer. For his master's thesis he worked on brain-computer interfaces using machine learning. These assist non-verbal and non-mobile persons to communicate with their family and caregivers.

    He has worked in online advertising and digital media as both a data scientist and big data engineer, and built various high-throughput web services around said data. He has created new big data pipelines using Hadoop/Pig/MapReduce, and created machine learning models to predict click-through rate, news feed recommender systems using linear regression, Bayesian Bandits, and collaborative filtering and validated the results using A/B testing.

    He has taught undergraduate and graduate students in data science, statistics, machine learning, algorithms, calculus, computer graphics, and physics for students attending universities such as Columbia University, NYU, Humber College, and The New School.

    Multiple businesses have benefitted from his web programming expertise. He does all the backend (server), frontend (HTML/JS/CSS), and operations/deployment work. Some of the technologies he has used are: Python, Ruby/Rails, PHP, Bootstrap, jQuery (Javascript), Backbone, and Angular. For storage/databases he has used MySQL, Postgres, Redis, MongoDB, and more.

              The Deep Learning and Artificial Intelligence Introductory Bundle for $39        
    Companies Are Relying on Artificial Intelligence to Learn Faster Than Ever. Time to Catch Up.
    Expires October 31, 2021 23:59 PST
    Buy now and get 91% off

    Deep Learning Prerequisites: Linear Regression in Python


    KEY FEATURES

    Deep Learning is a set of powerful algorithms that are the force behind self-driving cars, image searching, voice recognition, and many, many more applications we consider decidedly "futuristic." One of the central foundations of deep learning is linear regression: using probability theory to gain deeper insight into the "line of best fit." This is the first step to building machines that, in effect, act like neurons in a neural network as they learn while they're fed more information. In this course, you'll start with the basics of building a linear regression module in Python, and progress into practical machine learning issues that will provide the foundations for an exploration of Deep Learning.

    • Access 20 lectures & 2 hours of content 24/7
    • Use a 1-D linear regression to prove Moore's Law
    • Learn how to create a machine learning model that can learn from multiple inputs
    • Apply multi-dimensional linear regression to predict a patient's systolic blood pressure given their age & weight
    • Discuss generalization, overfitting, train-test splits, & other issues that may arise while performing data analysis
    Like what you're learning? Try out The Advanced Guide to Deep Learning and Artificial Intelligence next.
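
    As a hedged sketch of the Moore's Law exercise, here is a 1-D linear regression on the logarithm of transistor counts, solved in closed form with least squares. The year/count pairs below are rough, made-up figures for illustration; the course works with a real dataset.

    import numpy as np

    # Illustrative (year, transistor count) pairs; Moore's Law is exponential,
    # so the line is fit to the log of the counts.
    years = np.array([1972, 1980, 1989, 2000, 2010], dtype=float)
    transistors = np.array([3.5e3, 3.0e4, 1.2e6, 4.2e7, 2.3e9])

    X = np.column_stack([years, np.ones_like(years)])
    y = np.log(transistors)

    # Closed-form least squares fit of y = slope * year + intercept.
    (slope, intercept), *_ = np.linalg.lstsq(X, y, rcond=None)
    print("estimated doubling time (years):", np.log(2) / slope)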

    PRODUCT SPECS

    Details & Requirements

    • Length of time users can access this course: lifetime
    • Access options: web streaming, mobile streaming
    • Certification of completion not included
    • Redemption deadline: redeem your code within 30 days of purchase
    • Experience level required: all levels, but you must have some knowledge of calculus, linear algebra, probability, Python, and Numpy
    • All code for this course is available for download here, in the directory linear_regression_class

    Compatibility

    • Internet required

    THE EXPERT

    The Lazy Programmer is a data scientist, big data engineer, and full stack software engineer. For his master's thesis he worked on brain-computer interfaces using machine learning. These assist non-verbal and non-mobile persons to communicate with their family and caregivers.

    He has worked in online advertising and digital media as both a data scientist and big data engineer, and built various high-throughput web services around said data. He has created new big data pipelines using Hadoop/Pig/MapReduce, and created machine learning models to predict click-through rate, news feed recommender systems using linear regression, Bayesian Bandits, and collaborative filtering and validated the results using A/B testing.

    He has taught undergraduate and graduate students in data science, statistics, machine learning, algorithms, calculus, computer graphics, and physics for students attending universities such as Columbia University, NYU, Humber College, and The New School.

    Multiple businesses have benefitted from his web programming expertise. He does all the backend (server), frontend (HTML/JS/CSS), and operations/deployment work. Some of the technologies he has used are: Python, Ruby/Rails, PHP, Bootstrap, jQuery (Javascript), Backbone, and Angular. For storage/databases he has used MySQL, Postgres, Redis, MongoDB, and more.

    Deep Learning Prerequisites: Logistic Regression in Python


    KEY FEATURES

    Logistic regression is one of the most fundamental techniques used in machine learning, data science, and statistics, as it may be used to create a classification or labeling algorithm that quite resembles a biological neuron. Logistic regression units, by extension, are the basic bricks in the neural network, the central architecture in deep learning. In this course, you'll come to terms with logistic regression using practical, real-world examples to fully appreciate the vast applications of Deep Learning.

    • Access 31 lectures & 3 hours of content 24/7
    • Code your own logistic regression module in Python
    • Complete a course project that predicts user actions on a website given user data
    • Use Deep Learning for facial expression recognition
    • Understand how to make data-driven decisions
    Like what you're learning? Try out The Advanced Guide to Deep Learning and Artificial Intelligence next.
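
    For a concrete picture of a logistic regression unit, here is a minimal sketch of gradient descent on the cross-entropy loss. This is my own example on synthetic data, not the course's project code.

    import numpy as np

    def sigmoid(z):
        return 1.0 / (1.0 + np.exp(-z))

    # Synthetic two-class data, made up for the example.
    rng = np.random.default_rng(1)
    X = np.vstack([rng.normal(-1, 1, size=(50, 2)), rng.normal(1, 1, size=(50, 2))])
    y = np.concatenate([np.zeros(50), np.ones(50)])

    w, b, lr = np.zeros(2), 0.0, 0.1
    for _ in range(200):
        p = sigmoid(X @ w + b)
        # Gradient of the cross-entropy loss with respect to w and b.
        w -= lr * X.T @ (p - y) / len(y)
        b -= lr * np.mean(p - y)

    print("training accuracy:", np.mean((sigmoid(X @ w + b) > 0.5) == y))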

    PRODUCT SPECS

    Details & Requirements

    • Length of time users can access this course: lifetime
    • Access options: web streaming, mobile streaming
    • Certification of completion not included
    • Redemption deadline: redeem your code within 30 days of purchase
    • Experience level required: all levels, but you must have some knowledge of calculus, linear algebra, probability, Python, and Numpy
    • All code for this course is available for download here, in the directory logistic_regression_class

    Compatibility

    • Internet required

    THE EXPERT

    The Lazy Programmer is a data scientist, big data engineer, and full stack software engineer. For his master's thesis he worked on brain-computer interfaces using machine learning. These assist non-verbal and non-mobile persons to communicate with their family and caregivers.

    He has worked in online advertising and digital media as both a data scientist and big data engineer, and built various high-throughput web services around said data. He has created new big data pipelines using Hadoop/Pig/MapReduce, and created machine learning models to predict click-through rate, news feed recommender systems using linear regression, Bayesian Bandits, and collaborative filtering and validated the results using A/B testing.

    He has taught undergraduate and graduate students in data science, statistics, machine learning, algorithms, calculus, computer graphics, and physics for students attending universities such as Columbia University, NYU, Humber College, and The New School.

    Multiple businesses have benefitted from his web programming expertise. He does all the backend (server), frontend (HTML/JS/CSS), and operations/deployment work. Some of the technologies he has used are: Python, Ruby/Rails, PHP, Bootstrap, jQuery (Javascript), Backbone, and Angular. For storage/databases he has used MySQL, Postgres, Redis, MongoDB, and more.

    Data Science: Deep Learning in Python


    KEY FEATURES

    Artificial neural networks are the architecture that makes Apple's Siri recognize your voice, Tesla's self-driving cars know where to turn, and Google Translate learn new languages, along with many more technological features you have quite possibly taken for granted. The data science that unites all of them is Deep Learning. In this course, you'll build your very first neural network, going beyond basic models to build networks that automatically learn features.

    • Access 37 lectures & 4 hours of content 24/7
    • Extend the binary classification model to multiple classes using the softmax function
    • Code the important training method, backpropagation, in Numpy
    • Implement a neural network using Google's TensorFlow library
    • Predict user actions on a website given user data using a neural network
    • Use Deep Learning for facial expression recognition
    • Learn some of the newest development in neural networks
    Like what you're learning? Try out The Advanced Guide to Deep Learning and Artificial Intelligence next.
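
    To make the softmax extension concrete, here is a small sketch, with made-up layer sizes, of a one-hidden-layer forward pass ending in a softmax over several classes. It is mine rather than the course's code.

    import numpy as np

    def softmax(a):
        """Row-wise softmax; subtracting the row max keeps the exponentials stable."""
        e = np.exp(a - a.max(axis=1, keepdims=True))
        return e / e.sum(axis=1, keepdims=True)

    # Made-up sizes: 10 samples, 4 inputs, 5 hidden units, 3 classes.
    rng = np.random.default_rng(0)
    X = rng.normal(size=(10, 4))
    W1, b1 = rng.normal(size=(4, 5)), np.zeros(5)
    W2, b2 = rng.normal(size=(5, 3)), np.zeros(3)

    hidden = np.tanh(X @ W1 + b1)
    probs = softmax(hidden @ W2 + b2)
    print(probs.sum(axis=1))  # each row sums to 1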

    PRODUCT SPECS

    Details & Requirements

    • Length of time users can access this course: lifetime
    • Access options: web streaming, mobile streaming
    • Certification of completion not included
    • Redemption deadline: redeem your code within 30 days of purchase
    • Experience level required: intermediate, but you must have some knowledge of calculus, linear algebra, probability, Python, and Numpy
    • All code for this course is available for download here, in the directory ann_class

    Compatibility

    • Internet required

    THE EXPERT

    The Lazy Programmer is a data scientist, big data engineer, and full stack software engineer. For his master's thesis he worked on brain-computer interfaces using machine learning. These assist non-verbal and non-mobile persons to communicate with their family and caregivers.

    He has worked in online advertising and digital media as both a data scientist and big data engineer, and built various high-throughput web services around said data. He has created new big data pipelines using Hadoop/Pig/MapReduce, and created machine learning models to predict click-through rate, news feed recommender systems using linear regression, Bayesian Bandits, and collaborative filtering and validated the results using A/B testing.

    He has taught undergraduate and graduate students in data science, statistics, machine learning, algorithms, calculus, computer graphics, and physics for students attending universities such as Columbia University, NYU, Humber College, and The New School.

    Multiple businesses have benefitted from his web programming expertise. He does all the backend (server), frontend (HTML/JS/CSS), and operations/deployment work. Some of the technologies he has used are: Python, Ruby/Rails, PHP, Bootstrap, jQuery (Javascript), Backbone, and Angular. For storage/databases he has used MySQL, Postgres, Redis, MongoDB, and more.

    Data Science: Practical Deep Learning in Theano & TensorFlow


    KEY FEATURES

    The applications of Deep Learning are many, and constantly growing, just like the neural networks that it supports. In this course, you'll delve into advanced concepts of Deep Learning, starting with the basics of TensorFlow and Theano, understanding how to build neural networks with these popular tools. Using these tools, you'll learn how to build and understand a neural network, knowing exactly how to visualize what is happening within a model as it learns.

    • Access 23 lectures & 3 hours of programming 24/7
    • Discover batch & stochastic gradient descent, two techniques that allow you to train on a small sample of data at each iteration, greatly speeding up training time
    • Discuss how momentum can carry you through local minima
    • Learn adaptive learning rate techniques like AdaGrad & RMSprop
    • Explore dropout regularization & other modern neural network techniques
    • Understand the variables & expressions of TensorFlow & Theano
    • Set up a GPU-instance on AWS & compare the speed of CPU vs GPU for training a deep neural network
    • Look at the MNIST dataset & compare against known benchmarks
    Like what you're learning? Try out The Advanced Guide to Deep Learning and Artificial Intelligence next.

    PRODUCT SPECS

    Details & Requirements

    • Length of time users can access this course: lifetime
    • Access options: web streaming, mobile streaming
    • Certification of completion not included
    • Redemption deadline: redeem your code within 30 days of purchase
    • Experience level required: all levels, but you must have some knowledge of calculus, linear algebra, probability, Python, and Numpy
    • All code for this course is available for download here, in the directory ann_class2

    Compatibility

    • Internet required

    THE EXPERT

    The Lazy Programmer is a data scientist, big data engineer, and full stack software engineer. For his master's thesis he worked on brain-computer interfaces using machine learning. These assist non-verbal and non-mobile persons to communicate with their family and caregivers.

    He has worked in online advertising and digital media as both a data scientist and big data engineer, and built various high-throughput web services around said data. He has created new big data pipelines using Hadoop/Pig/MapReduce, and created machine learning models to predict click-through rate, news feed recommender systems using linear regression, Bayesian Bandits, and collaborative filtering and validated the results using A/B testing.

    He has taught undergraduate and graduate students in data science, statistics, machine learning, algorithms, calculus, computer graphics, and physics for students attending universities such as Columbia University, NYU, Humber College, and The New School.

    Multiple businesses have benefitted from his web programming expertise. He does all the backend (server), frontend (HTML/JS/CSS), and operations/deployment work. Some of the technologies he has used are: Python, Ruby/Rails, PHP, Bootstrap, jQuery (Javascript), Backbone, and Angular. For storage/databases he has used MySQL, Postgres, Redis, MongoDB, and more.

              NSI Instruments        
    Magenta is a Google Brain project that has created something called NSynth, which seems to extract the audio DNA from samples and then mix it all together in a user-controllable synth. While I can't give you the synth (it is highly confusing), which will hopefully someday come out as a VST, I can apparently use the NSynth Dataset to create more traditional, though possibly funky-sounding, instruments.

    Acoustic Strings            Volume 1

    20 string 'instruments', 4,000 samples, uses Big Bob's WIPS script for fake round robins and on sustains, legato.


    Acoustic Keyboards        Volume 1
    Soundcloud Demo

    Acoustic Brass                Volume 1


    License: The dataset is made available by Google Inc. under a Creative Commons Attribution 4.0 International (CC BY 4.0) license.

    Paper: Jesse Engel, Cinjon Resnick, Adam Roberts, Sander Dieleman, Douglas Eck, Karen Simonyan, and Mohammad Norouzi. "Neural Audio Synthesis of Musical Notes with WaveNet Autoencoders." 2017.
          Android and TensorFlow: Trying the Google I/O 2017 Codelab        

    Hello. This is Nakayama, who wishes he could have gone to Google I/O.
    This year's I/O again featured a number of interesting announcements.
    A great help for efficiently understanding the many announcements is the set of Codelabs published alongside them.
    They are excellent content that lets you confirm how things work while actually writing code.
    This time, I would like to deepen my understanding of TensorFlow, which has reportedly become dramatically easier to use from Android, by working through one of these Codelabs.

    1. About the Codelab "Android & TensorFlow: Artistic Style Transfer"

    The one we will try this time is this Codelab, which embeds TensorFlow into an Android app to experiment with "artistic style transfer".

    "Artistic style transfer" uses deep learning to generate a new image by applying only the expressive style of one image to another image.
    Concretely, the flow is to:

    1. take the image on the left,
    2. apply the style of the image in the middle, and
    3. generate the image on the right,

    as illustrated in the image below.
    https://codelabs.developers.google.com/codelabs/tensorflow-style-transfer-android/img/c8b30d69a632f9a2.png

    By trying the "artistic style transfer" feature in this Codelab, we can also confirm the following along the way:

    • using the TensorFlow library from an Android app
    • embedding a pre-trained TensorFlow model into an Android app and using it
    • running the "inference" process on top of a pre-trained model
    • accessing specific "nodes" of the TensorFlow graph from the app

    To dig into the details, an understanding of TensorFlow itself eventually becomes essential, but this time I want to focus on how simply TensorFlow's functionality can be used from an Android app.

    The local environment I used to try this Codelab is as follows.

    • Mac(macOS Sierra)
    • Android Studio(2.3.1)
    • Nexus5X

    2. Getting the source code

    From a terminal, run the following command in any directory to fetch the source code of the app project.

    git clone https://github.com/googlecodelabs/tensorflow-style-transfer-android
    

    3. Opening the project in Android Studio

    Select the "android" folder inside the "tensorflow-style-transfer-android" folder created by the git clone and open it in Android Studio.
     
    f:id:IntelligentTechnology:20170522171839p:plain:w440

    After the build finishes, you can confirm in Android Studio that the file structure looks like the following.
     
     f:id:IntelligentTechnology:20170522175308p:plain:w320

    At this point the TensorFlow-related processing has not been implemented yet; the app is only a "skeleton".
    Launching the app in this state looks like the following.

    https://codelabs.developers.google.com/codelabs/tensorflow-style-transfer-android/img/725a01c7c2722be7.png

    The upper half is the camera preview; the lower half contains the button that applies a "style" and the candidate "styles" to choose from.

    4. Important methods

    The core of the processing is the "StylizeActivity" class provided in the project.
    The following methods of this class play the important roles.

    onPreviewSizeChosen

    This method is called when the camera functionality implemented in the app becomes available.
    The various initialization steps are performed here.

    setStyle

    The method that specifies which style to apply.

    renderDebug

    The method that outputs debug information when a style is applied.

    stylizeImage

    The method that applies the specified style to an image.

    The methods of the ImageUtils class

    The ImageUtils class provided in the project implements several helper methods for image processing.
    For performance reasons they rely on implementations written in C++. The body of that code is shipped pre-built as "libtensorflow_demo.so", as shown below.
     
     f:id:IntelligentTechnology:20170523101043p:plain:w280

    The StylizeActivity class also uses the methods of this ImageUtils class.

    5. About the pre-trained TensorFlow model

    The pre-trained TensorFlow model used here is contained in the file "stylize_quantized.pb" shown below.
     
     f:id:IntelligentTechnology:20170523102534p:plain:w280

    As for its structure, as explained on this page as well, it is said to be represented as shown below,

    https://codelabs.developers.google.com/codelabs/tensorflow-style-transfer-android/img/dac7f10dfec75e53.png

    but as far as running this Codelab goes, you do not need to understand all of these detailed specifications.
    That said, to make fuller use of TensorFlow in the future, this kind of knowledge will no doubt become essential.

    6. Loading the TensorFlow library

    Open the "build.gradle" file,
     
     f:id:IntelligentTechnology:20170523110048p:plain:w320

    then add a "compile ..." line as shown below inside the "dependencies" block at the end of the file, and click "Sync Now" shown at the top right of the screen.

    android {
      ・・・
      dependencies {
        compile 'org.tensorflow:tensorflow-android:1.2.0-preview'
      }
    }

    Remarkably, a single line added to the build.gradle file is all it takes to use the TensorFlow library.
    Previously you apparently had to build TensorFlow itself, set up the NDK, and so on; the preparation was a real chore. Times have changed.

    7. Adding the code

    Add the following private fields to the "StylizeActivity" class.

    // Class provided by the TensorFlow library that serves as the
    // interface for calling TensorFlow functionality
    private TensorFlowInferenceInterface inferenceInterface;
    // The pre-trained model file
    private static final String MODEL_FILE = 
      "file:///android_asset/stylize_quantized.pb";
    // Identifiers of the TensorFlow "nodes"
    private static final String INPUT_NODE = "input";
    private static final String STYLE_NODE = "style_num";
    private static final String OUTPUT_NODE = 
      "transformer/expand/conv3/conv/Sigmoid";
    

    Next, inside the "onPreviewSizeChosen" method, add code that initializes the "inferenceInterface" field, as shown below.

    @Override
    public void onPreviewSizeChosen(final Size size, final int rotation) {
      ・・・
      // Initialize the inferenceInterface field
      inferenceInterface = 
        new TensorFlowInferenceInterface(getAssets(), MODEL_FILE);
    }
    

    Next, in the "stylizeImage" method, find the part that reads

    // TODO: Process the image in TensorFlow here.
    

    and below it add code that calls the inferenceInterface methods, as shown below.

    private void stylizeImage(final Bitmap bitmap) {
      ・・・
      // TODO: Process the image in TensorFlow here.
     
      // Copy the source image data over to TensorFlow
      inferenceInterface.feed(INPUT_NODE, floatValues,
                1, bitmap.getWidth(), bitmap.getHeight(), 3);
      // Copy the data of the style to apply over to TensorFlow
      inferenceInterface.feed(STYLE_NODE, styleVals, NUM_STYLES);
     
      // Run the TensorFlow processing that applies the style to the source image
      inferenceInterface.run(new String[] {OUTPUT_NODE}, isDebug());
    
      // Fetch the result and store it in the array
      inferenceInterface.fetch(OUTPUT_NODE, floatValues);
     
      ・・・
    

    Finally, we also add a little code to the "renderDebug" method, which produces the debug output.
    This is not strictly necessary, but we add it while we are at it.
    Right after the already-implemented line

    final Vector<String> lines = new Vector<>();
    

    add code as shown below, immediately after it.

    private void renderDebug(final Canvas canvas) {
    ・・・
      final Vector<String> lines = new Vector<>();
    
      // Output debug information about the TensorFlow processing
      final String[] statLines = inferenceInterface.getStatString().split("\n");
      Collections.addAll(lines, statLines);
      lines.add("");
    
      ・・・
    

    That completes the code additions.
    Incidentally, for a sample app this one does quite a lot, and the code already implemented in the "StylizeActivity" class is surprisingly large.
    Calling into TensorFlow takes only a few added lines, but implementing that cleanly as an app still requires a fair amount of code.

    8. Running the app

    Run the finished app on an actual Android device.
    The image shown in the camera preview in the upper half of the screen is rendered in real time with the style selected in the lower half applied.

    The subject I used to check the effect was the "hiyakake udon, 1.5 balls, with chikuwa tempura (free wakame topping), 380 yen" from Tamoya Hayashi, a shop I visit almost every week.
    The weather has been getting warmer lately, and the season when cold kake udon tastes best has arrived.
     
     f:id:IntelligentTechnology:20170523133727j:plain:w400

    Seen through the app built in this Codelab, it was rendered as follows.
     
     f:id:IntelligentTechnology:20170523133755p:plain:w400

    The result does not look appetizing at all, but as far as applying the style goes, it turned out as expected.

    9. Summary

    With TensorFlow the main event is arguably the "training" side, yet until now even just using a pre-trained model from an app took real effort.
    Building the app in this Codelab will by no means make you a TensorFlow master, but as a way to experience how easily its functionality can be called with a single line added to build.gradle, I found it well worth trying.


          I attended the Google Cloud Vision API hands-on        

    This is Nakayama.

    On Sunday, March 20, I took part in the "Google Cloud Vision API Hands-on" event held at "e-topia Kagawa" in Takamatsu, Kagawa Prefecture.

    gdgshikoku.connpass.com

    At the time, though, I never dreamed that Google had a grasp of even that kind of information...


    This hands-on was held as one of the events of the "e-topia Culture Festival 2016".

    f:id:IntelligentTechnology:20160322165858j:plain:w400

    What is the Google Cloud Vision API?

    The Google Cloud Vision API is an API for image analysis provided by Google.
    The following is quoted from the hands-on site.

    It tells you what is in a given image, whether a person's expression is smiling or angry, and, if a famous landmark appears, where and what it is.
    It can also read text, like OCR.

    The Google Cloud Vision API itself is a paid service, but in this hands-on we could try its features within the free trial.
    The free trial allows 60 days of use up to 300 dollars. In addition, up to roughly 1,000 requests per month can apparently be made without incurring any charge.

    Trying the Google Cloud Vision API

    The hands-on itself followed the contents of the Google Cloud Vision API "Getting Started" page.
    You need to set up an API key and a few other things beforehand, but I will leave those to the explanation on the official site.

    There is actually only one kind of Google Cloud Vision API call.
    On a Mac, the simplest way to try it is the following sequence of steps.

    1. First, prepare the image file to be analyzed,

    http://people.ucsc.edu/~kamacdon/bulldogpuppy1.jpg
    http://people.ucsc.edu/~kamacdon/

    2. encode it with base64,

    f:id:IntelligentTechnology:20160322171525p:plain

    3. embed the base64-encoded string into JSON data like the following,

    f:id:IntelligentTechnology:20160322171708p:plain

    4. and send that JSON data file to the Google Cloud Vision API server;

    f:id:IntelligentTechnology:20160322172342p:plain

    5. the server then returns an analysis result like the following.

    f:id:IntelligentTechnology:20160322172637p:plain

    In this case it was judged to be a "dog" with a score above 0.99.
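
    The same request can also be sent from a short script instead of assembling the JSON by hand. The following Python sketch is not part of the hands-on materials; the API key and file name are placeholders, and it posts to the standard v1 images:annotate endpoint.

    import base64
    import json
    import requests

    API_KEY = "YOUR_API_KEY"
    ENDPOINT = "https://vision.googleapis.com/v1/images:annotate?key=" + API_KEY

    with open("puppy.jpg", "rb") as f:
        content = base64.b64encode(f.read()).decode("utf-8")

    body = {
        "requests": [{
            "image": {"content": content},
            "features": [{"type": "LABEL_DETECTION", "maxResults": 1}],
        }]
    }

    resp = requests.post(ENDPOINT, data=json.dumps(body),
                         headers={"Content-Type": "application/json"})
    print(resp.json())  # expect a labelAnnotations entry such as "dog"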

    Trying various things

    For the image-analysis "type" specified in the JSON data above, the following values can also be used.

    • FACE_DETECTION: face detection
    • LANDMARK_DETECTION: identify the location shown in an image of a building or other landmark
    • LOGO_DETECTION: detection of corporate and brand logos
    • LABEL_DETECTION: category detection / image-type classification (the one used above)
    • TEXT_DETECTION: text recognition (OCR)
    • SAFE_SEARCH_DETECTION: detection of inappropriate images
    • IMAGE_PROPERTIES: extraction of image attributes (such as the dominant colors)

    I tried several of these below.

    Image-type classification

    For an image of Kamatama-kun, the hugely popular four-panel comic serialized in the Shikoku Shimbun,

    http://pbs.twimg.com/media/CTsOGKvUkAAGD66.jpg
    http://sp.shikoku-np.co.jp/comic/archive.htm

    f:id:IntelligentTechnology:20160322173424p:plain

    even after increasing the number of results to five, I only got rather vague labels such as "book" and "area".
    f:id:IntelligentTechnology:20160322173506p:plain

    OCR

    However, calling the OCR feature on the same Kamatama-kun image,
    http://pbs.twimg.com/media/CTsOGKvUkAAGD66.jpg
    http://sp.shikoku-np.co.jp/comic/archive.htm

    f:id:IntelligentTechnology:20160322173748p:plain

    it picked up the text quite accurately!

    f:id:IntelligentTechnology:20160322173812p:plain

    Location identification

    Calling the "LANDMARK_DETECTION" feature on a photo of "Gamou Udon" in Kagawa Prefecture,

    http://www.sanuki-udon.net/images/udon/R0063793.jpg
    http://www.sanuki-udon.net/2009/12/post_237.html

    f:id:IntelligentTechnology:20160322174300p:plain

    latitude and longitude information came back as shown below. (The description was "Udon".)

    f:id:IntelligentTechnology:20160322174403p:plain

    On a hunch, I searched for that latitude and longitude in Google Maps, and...

    f:id:IntelligentTechnology:20160322174512p:plain

    it really did show the location of the Gamou Udon shop. Google, that is a little scary!

    Face detection

    Trying face detection on a group photo of astronauts,

    http://blog-imgs-52.fc2.com/1/0/r/10rank/nasa_convert_20121023004955.jpeg
    http://10rank.blog.fc2.com/blog-entry-90.html

    f:id:IntelligentTechnology:20160322174954p:plain

    information such as coordinates was returned for each face it recognized.

    f:id:IntelligentTechnology:20160322175038p:plain

    Detection of inappropriate images

    I tested whether an image of a zombie from the TV series "The Walking Dead" counts as an "inappropriate image",

    f:id:IntelligentTechnology:20160322175246p:plain

    and the results were almost all "VERY_UNLIKELY" or "UNLIKELY", so perhaps it is not that "harmful"? "violence" came back as "UNLIKELY", so that score is slightly higher. (The image itself is frightening, so I will not post it here.)

    f:id:IntelligentTechnology:20160322175335p:plain

    Summary

    It was surprisingly easy to try out the image-recognition features.
    The Google Cloud Vision API itself reportedly uses the functionality of "TensorFlow", the artificial intelligence library.

    Being able to determine the location from a photo of a building felt a little scary, but that too is one of the fruits of the recent progress in artificial intelligence.
    Since the API is simple, I felt the range of possible applications should be broad.


              The 5 Deep Learning Startups Most Worth Watching in 2017        
    In 2016, artificial intelligence went mainstream. Google CEO Sundar Pichai even argued that the technology industry would shift from "mobile first" to "AI first." Apple applies AI technology in the iPhone, while Google uses it in the Pixel phone. AI is at work in Facebook's news feed, and Microsoft Word uses it as well. Samsung acquired the AI startup Viv to catch up with Apple's Siri. Messaging apps such as Skype and Messenger have also integrated AI chatbots.

    Within AI, a large share of the research is concentrated on deep learning. Deep learning techniques use large amounts of data to train artificial neural networks, giving those networks the ability to handle new data. Over the past five years, more and more deep learning startups have been emerging.

    In 2016, chip giant Intel acquired Nervana, a developer of deep learning hardware and software, while enterprise software company Salesforce acquired MetaMind, whose deep learning software can rapidly process large volumes of images and text. Both Nervana and MetaMind appeared on VentureBeat's 2015 list of the deep learning startups most worth watching. Meanwhile, the startups on the 2016 list are all moving ahead at full speed.

    Here are the 5 deep learning startups most worth watching in 2017:

    1. Bay Labs: Several startups are trying to apply deep learning to medical imaging, and Bay Labs is one of them. Its team includes a number of strong engineers, including Johan Mathe, who previously worked on Google's Project Loon. Yann LeCun, director of Facebook's AI research group, has invested in the startup, and other investors include Khosla Ventures.

    2. Cerebras Systems: Cerebras is a secretive startup led by Andrew Feldman, who previously sold his small-server company SeaMicro to AMD for $334 million. Feldman's new venture builds AI hardware. According to sources, the well-known VC firm Benchmark led a funding round of more than $20 million; Feldman declined to comment.

    3. Deep Vision: Based in Palo Alto, California, Deep Vision develops low-power chips for deep learning. Its two co-founders, Rehan Hammed and Wajahat Qadeer, wrote an interesting paper while studying at Stanford on a "convolution engine" chip multiprocessor.

    4. Graphcore: Graphcore develops an Intelligence Processing Unit (IPU) PCIe accelerator that neural networks can use for training or inference. The startup is also building software so that the existing MXNet and TensorFlow deep learning frameworks can run on its infrastructure. Investors include Bosch Venture Capital, Foundation Capital, and Samsung Catalyst Fund.

    5. ViSenze: ViSenze was founded in 2012. In the 2016 ImageNet image-recognition competition, it beat its competitors in certain categories. Its investors include Rakuten Ventures. ViSenze is a spin-off from NExT, a research center established by the National University of Singapore and Tsinghua University. Its software can recognize and tag objects in images and video and surface visually similar content.
              (cclxxiii) metacpan weekly report - Mojolicious & Moxie        
    This is the weekly favourites list of CPAN distributions. Votes count: 69
    Week's winners (+3): Mojolicious, Moxie
    Build date: 2017/07/16 16:00:59 GMT

    Clicked for first time:

    Increasing its reputation:

              Google releases TensorFlow Serving library        

    Google has just moved to a production release of TensorFlow Serving, its open source library for serving machine-learned models in production environments. A beta version of the technology was released in February.

    Part of Google’s TensorFlow machine intelligence project, the TensorFlow Serving 1.0 library is intended to aid the deployment of algorithms and experiments while maintaining the same server architecture and APIs. TensorFlow Serving lets you push out multiple versions of models over time, as well as roll them back.

    The library of course integrates with TensorFlow learning models, but it can also be extended to serve other model types.
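
    TensorFlow Serving discovers model versions as numbered subdirectories under a model base path, which is what makes pushing out new versions and rolling them back possible. As a rough sketch (assuming the TensorFlow 1.x SavedModel APIs; the toy model, paths, and version number are placeholders), exporting such a versioned model might look like this in Python:

    import tensorflow as tf

    export_base = "/models/half_plus_two"   # passed to the server as --model_base_path
    version = 1                             # Serving watches numeric subdirectories

    x = tf.placeholder(tf.float32, shape=[None, 1], name="x")
    y = 0.5 * x + 2.0                       # a trivial stand-in for a real model

    with tf.Session() as sess:
        sess.run(tf.global_variables_initializer())
        builder = tf.saved_model.builder.SavedModelBuilder(
            "{}/{}".format(export_base, version))
        signature = tf.saved_model.signature_def_utils.predict_signature_def(
            inputs={"x": x}, outputs={"y": y})
        builder.add_meta_graph_and_variables(
            sess,
            [tf.saved_model.tag_constants.SERVING],
            signature_def_map={"predict": signature})
        builder.save()
    # Exporting version 2 to /models/half_plus_two/2 later lets Serving roll forward.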

    To read this article in full or to leave a comment, please click here


              IBM speeds deep learning by using multiple servers        

    For everyone frustrated by how long it takes to train deep learning models, IBM has some good news: It has unveiled a way to automatically split deep-learning training jobs across multiple physical servers -- not just individual GPUs, but whole systems with their own separate sets of GPUs.

    Now the bad news: It's available only in IBM's PowerAI 4.0 software package, which runs exclusively on IBM's own OpenPower hardware systems.

    Distributed Deep Learning (DDL) doesn't require developers to learn an entirely new deep learning framework. It repackages several common frameworks for machine learning: TensorFlow, Torch, Caffe, Chainer, and Theano. Deep learning projects that use those frameworks can then run in parallel across multiple hardware nodes.

    To read this article in full or to leave a comment, please click here


              MPI's Place in Big Computing        

    The organizers of EuroMPI 2016 were kind enough to invite me to give a keynote and participate in a panel at their meeting, which was held at the end of September in beautiful Edinburgh. The event was terrific, with lots of very interesting work going on in MPI implementations and with MPI.

    The topic of my talk was “MPI’s Place in Big Computing”; the materials from the talk can be found on github. The talk, as you might expect, included discussion of high-productivity big data frameworks, but also — and missing from the discussion in my “HPC is dying” blog post — the “data layer” frameworks that underpin them.

    I think a lot of people have, quite reasonably, taken that blog post to suggest that Spark, for example, is a competitor to MPI; the point I wanted to make is a little more nuanced than that.

    I’m actually skeptical of Spark’s utility for (e.g.) large-scale simulations. However attractive the model is from a variety of points of view, absent some huge breakthrough I don’t think that functional models with immutable data can support the performance, memory requirements, or performance predictability we require. (But who knows; maybe that’ll be one of the compromises we find we have to make on the road to exascale).

    But whatever you might think of Spark’s efficacy for your particular use case,

    • A lot of people manifestly find it to be extremely useful for their use case; and
    • Performance is quite important to those communities.

    So given that, why isn’t Spark built atop of MPI for network communications? And why isn’t TensorFlow, or Dask, or SeaStar?

    The past five years have seen a huge number of high-productivity tools for large-scale number crunching gain extremely rapid adoption. Even if you don’t like those particular tools for your problems, surely you’d like for there to exist some tools like that for the traditional HPC community; why do other communications frameworks support this flourishing ecosystem of platforms, and MPI doesn’t?

    There’s another argument there, too - simply from a self-preservation point of view, it would be in MPI’s interest to be adopted by a high-profile big data platform to ensure continued success and support. But none are; why? It’s not because the developers of Spark or at Google are just too dumb to figure out MPI’s syntax.

    Going through what does get used for these packages and what doesn’t — which is what I do in this talk — I think the issues become fairly clear. MPI wants to be both a low-level communications framework and a higher-level programming model, and ends up tripping over its own feet trying to dance both dances. As a communications “data plane” it imposes too many high-level decisions on applications — no fault tolerance, restrictive communications semantics (in-order and arrival guarantees), and provides too few services (e.g. a performant active message/RPC layer). And as a high-level programming model it is too low level and is missing different services (communications-aware scheduling came up in several guises at the meeting).

    I don’t think that’s insurmountable; I think inside MPI implementations there is a performant, network-agnostic low-level communications layer trying to get out. Exposing more MPI runtime services is a move in the right direction. I was surprised at how open the meeting participants were to making judicious changes — even perhaps breaking some backwards compatability — in the right directions.

    Thanks again to the organizers for extending the opportunity to participate; it was great.

    My slides can be seen below or on github, where the complete materials can be found.


              Spark, Chapel, TensorFlow: Workshop at UMich        

    The kind folks at the University of Michigan’s Center for Computational Discovery and Engineering (MICDE), which is just part of the very impressive Advanced Research Computing division, invited me to give a workshop there a couple of months ago about the rapidly-evolving large-scale numerical computing ecosystem.

    There’s lots that I want to do to extend this to a half-day length, but the workshop materials — including a VM that can be used to play with Spark, Chapel and TensorFlow, along with Jupyter notebooks for each — can be found on GitHub and may be of some use to others as they stand.

    The title and abstract follow.

    Next Generation HPC? What Spark, TensorFlow, and Chapel are teaching us about large-scale numerical computing

    For years, the academic science and engineering community was almost alone in pursuing very large-scale numerical computing, and MPI - the 1990s-era message passing library - was the lingua franca for such work. But starting in the mid-2000s, others became interested in large-scale computing on data. First, internet-scale companies like Google and Yahoo! started performing fairly basic analytics tasks at enormous scale, and now many others are tackling increasingly complex and data-heavy machine-learning computations, which involve very familiar scientific computing tasks such as linear algebra, unstructured mesh decomposition, and numerical optimization. But these new communities have created programming environments which emphasize what we’ve learned about computer science and programmability since 1994 - with greater levels of abstraction and encapsulation, separating high-level computation from the low-level implementation details, and some in HPC are starting to notice. This talk will give a brief introduction to the Apache Spark environment and Google’s Tensor Flow machine-learning package for high-level numerical computation, as well as the HPC-focused Chapel language from Cray, to show where each can be used today and how they might be used in the future. The slides for this talk, and examples for each package along with a virtual machine which can be used for running them, will be available at https://github.com/ljdursi/Spark-Chapel-TF-UMich-2016 .


              Comment on Google’s TensorFlow Bet May Be Starting To Pay Off (GOOGL) by Jack Smith        
    TensorFlow (TF) on GitHub has over 63k stars! It is adding stars at a rate of 8x the next best AI framework. TF is already the canonical AI framework. Tutorials, podcasts, taught at University, etc. The AI framework is basically owning the AI OS. Honestly the battle is already over and Google won.
              Digitalia #377 - Scoiattoli e Siringhe        

    Blood glucose and advanced sensors on smartwatches. Convicted over an emoji. The internet is broken, and Ev Williams has the remedy. The Blue Whale Challenge. The news from Google I/O. These and many other stories are discussed in this week's episode.

    From the distributed studio of digitalia:
    Franco Solerio, Massimo De Santo, Michele Di Maio, Francesco Facconi

    Executive producers:
    Andrea Faino, Ekaterina Zakaryukina, Giorgio Beggiora, Enrico Carangi, Fulvio Barizzone, Gaetano Cimmino, Massimiliano Saggia, Nicola Bisceglie, Giorgio Puglisi, Umberto Marcello, Giacomo Cipriani, Roberto Roccatello, David Padovani, Fabio Brunelli, Matteo Ottone, Alessandro Grossi, Emanuele Zdunich, Marco Santonocito, Andrea Favaretto, Gian Paolo Boarina, Roberto Nespoli, Marco Traverso, Marco Barabino, Gianni Stanghellini (Walk About Tuscany), Claudio Meloni, Roberto Esposito, Daniele Tomasoni, Carlotta Cubeddu, Mauro Tommasi, Donato Gravino, Emanuele Libori, Giancarlo Merlin, Marco Giorgetti, Mattia Garbarini, Andrea Delise, Alessandro Lago, Roberto Terranova, Lorenzo Bernabo', Sascha Navarra, Marcello Marigliano

    Sponsor:
    Squarespace.com - use the coupon code "DIGITALIA" to get 10% off the annual subscription.

    Links: Picks of the day:
              How machine learning in G Suite makes people more productive        

    Email management, formatting documents, creating expense reports. These are just some of the time-sinks that can affect your productivity at work. At Google, this is referred to as “overhead”—time spent working on tasks that do not directly relate to creative output—and it happens a lot.

    According to a Google study in 2015, the average worker spends only about 5 percent of his or her time actually coming up with the next big idea. The rest of our time is caught in the quicksand of formatting, tracking, analysis or other mundane tasks. That’s where machine learning can help.

    Machine learning algorithms observe examples and make predictions based on data. In G Suite, machine learning models make your workday more efficient by taking over menial tasks, like scheduling meetings, or by predicting information you might need and surfacing it for you, like suggesting Docs.

    Time spent chart

    Source: Google Data, April 2015

    Eliminating spam within Gmail using machine learning

    One of the earliest machine learning use cases for G Suite was within Gmail. Historically, Gmail used a rule-based system, which meant our anti-spam team would create new rules to match individual spam patterns. Over a decade of using this process, we improved spam detection accuracy to 99 percent.

    Starting in 2014, our team augmented this rule-based system to generate rules using machine learning algorithms instead, taking spam detection one step further. Now, we use TensorFlow and other machine learning to continually regenerate the “spam filter,” so the system has learned to predict which emails are most likely junk. Machine learning finds new patterns and adapts far quicker than previous manual systems—it’s a big part of the reason that more than one billion Gmail users avoid spam within their account.

    See machine learning in your favorite G Suite apps

    G Suite’s goal is to help teams accomplish more with its intelligent apps, no matter where they are in the world. And chances are, you’ve already seen machine learning integrated into your day-to-day work to do just that.

    Smart Reply, for example, uses machine learning to generate three natural language responses to an email. So if you find yourself on the road or pressed for time and in need of a quick way to clear your inbox, let Smart Reply do it for you.
    [Animation: Smart Reply suggesting email responses]

    Explore in Docs, Slides and Sheets uses machine learning to eliminate time spent on mundane tasks, like tracking down documents or information on the web, reformatting presentations or performing calculations within spreadsheets.

    [Image: Explore in Docs, Slides and Sheets]

    Quick Access in Drive predicts and suggests files you might need within Drive. Using machine intelligence, Quick Access can predict files based on who you share files with frequently, when relevant meetings occur within your Calendar or if you tend to use files at certain times of the day.

    [Image: Quick Access in Drive]

    To learn more about how machine intelligence can make your life easier, sign up for this free webinar on June 15, 2017, featuring experts from MIT Research, Google and other companies. You can also check out the Big Data and Machine Learning blog or watch this video from Google Cloud Next with Ryan Tabone, director of product management at Google, where he explains more about “overhead.”


              tshimizu8        
    [Planning to attend] This is Shimizu. I'd like a Bluetooth keyboard. I'll talk about Google TensorFlow. Convolutional neural networks... I'll bring something along.
               Solving ill-posed inverse problems using iterative deep neural networks / Jobs: 2 Postdocs @ KTH, Sweden - implementation -        
    Ozan just sent me the following e-mail. It has the right mix of elements of The Great Convergence: applying learning-to-learn methods to inverse problems we thought compressive sensing could solve well (CT tomography), papers supporting those results, an implementation, a blog entry, and two postdoc jobs. Awesome!
    Dear Igor,


    I have for some time followed your excellent blog Nuit Blanche. I'm not familiar with how you select entries for Nuit Blanche, but let me take the opportunity to provide potential input for Nuit Blanche on the exciting research we pursue at the Department of Mathematics, KTH Royal Institute of Technology. If you find any of this interesting, please feel free to post it on Nuit Blanche.


    1. Deep learning and tomographic image reconstruction
    The main objective for the research is to develop theory and algorithms for 3D tomographic reconstruction. An important recent development has been to use techniques from deep learning to solve inverse problems. We have developed a rather generic, yet adaptable, framework that combines elements of variational regularization with machine learning for solving large scale inverse problems. More precisely, the idea is to learn a reconstruction scheme by making use of the forward operator, noise model and other a priori information. This goes beyond learning a denoiser where one first performs an initial (non machine-learning) reconstruction and then uses machine learning on the resulting image-to-image (denoising) problem. Several groups have done learning a denoiser and the results are in fact quite remarkable, outperforming previous state of the art methods. Our approach however combines reconstruction and denoising steps which further improves the results. The following two arXiv-reports http://arxiv.org/abs/1707.06474 and http://arxiv.org/abs/1704.04058 provide more details, there is also a blog-post at http://adler-j.github.io/2017/07/21/Learning-to-reconstruct.html by one of our PhD students that explains this idea of "learning to reconstruct".


    2. Post doctoral fellowships
    I'm looking for two 2-year post-doctoral fellowships, one dealing with regularization of spatiotemporal and/or multichannel images and the other with methods for combining elements of variational regularization with deep learning for solving inverse problems. The announcements are given below. I would be glad if you could post these also on your blog.


    Postdoctoral fellow in PET/SPECT Image Reconstruction (S-2017-1166)
    Deadline: December 1, 2017
    Brief description:
    The position includes research & development of algorithms for PET and SPECT image reconstruction. Work is closely related to on-going research on (a) multi-channel regularization for PET/CT and SPECT/CT imaging, (b) joint reconstruction and image matching for spatio-temporal pulmonary PET/CT and cardiac SPECT/CT imaging, and (c) task-based reconstruction by iterative deep neural networks. An important part is to integrate routines for forward and backprojection from reconstruction packages like STIR and EMrecon for PET and NiftyRec for SPECT with ODL (http://github.com/odlgroup/odl), our Python based framework for reconstruction. Part of the research may include industrial (Elekta and Philips Healthcare) and clinical (Karolinska University Hospital) collaboration.
    Announcement & instructions:
    http://www.kth.se/en/om/work-at-kth/lediga-jobb/what:job/jobID:158920/type:job/where:4/apply:1

    Postdoctoral fellow in Image Reconstruction/Deep Dictionary Learning (S-2017-1165)
    Deadline: December 1, 2017
    Brief description:

    The position includes research & development of theory and algorithms that combine methods from machine learning with sparse signal processing for joint dictionary design and image reconstruction in tomography. A key element is to design dictionaries that not only yield sparse representation, but also contain discriminative information. Methods will be implemented in ODL (http://github.com/odlgroup/odl), our Python based framework for reconstruction which enables one to utilize the existing integration between ODL and TensorFlow. The research is part of a larger effort that aims to combine elements of variational regularization with machine learning for solving large scale inverse problems, see the arXiv-reports http://arxiv.org/abs/1707.06474 and http://arxiv.org/abs/1704.04058 and the blog-post at http://adler-j.github.io/2017/07/21/Learning-to-reconstruct.html for further details. Part of the research may include industrial (Elekta and Philips Healthcare) and clinical (Karolinska University Hospital) collaboration.
    Announcement & instructions:
    http://www.kth.se/en/om/work-at-kth/lediga-jobb/what:job/jobID:158923/type:job/where:4/apply:1




    Best regards,
    Ozan


    --

    Assoc. Prof. Ozan Öktem
    Director, KTH Life Science Technology Platform
    Web: http://ww.kth.se/lifescience


    Department of Mathematics
    KTH Royal Institute of Technology
    SE-100 44 Stockholm, Sweden
    E-mail: ozan@kth.se




    Learned Primal-dual Reconstruction by Jonas Adler, Ozan Öktem

    We propose a Learned Primal-Dual algorithm for tomographic reconstruction. The algorithm includes the (possibly non-linear) forward operator in a deep neural network inspired by unrolled proximal primal-dual optimization methods, but where the proximal operators have been replaced with convolutional neural networks. The algorithm is trained end-to-end, working directly from raw measured data and does not depend on any initial reconstruction such as FBP.
    We evaluate the algorithm on low dose CT reconstruction using both analytic and human phantoms against classical reconstruction given by FBP and TV regularized reconstruction as well as deep learning based post-processing of a FBP reconstruction.
    For the analytic data we demonstrate PSNR improvements of >10 dB when compared to both TV reconstruction and learned post-processing. For the human phantom we demonstrate a 6.6 dB improvement compared to TV and a 2.2 dB improvement as compared to learned post-processing. The proposed algorithm also improves upon the compared algorithms with respect to the SSIM and the evaluation time is approximately 600 ms for a 512 x 512 pixel dataset.  

    Solving ill-posed inverse problems using iterative deep neural networks by Jonas Adler, Ozan Öktem
    We propose a partially learned approach for the solution of ill-posed inverse problems with not necessarily linear forward operators. The method builds on ideas from classical regularization theory and recent advances in deep learning to perform learning while making use of prior information about the inverse problem encoded in the forward operator, noise model and a regularizing functional. The method results in a gradient-like iterative scheme, where the "gradient" component is learned using a convolutional network that includes the gradients of the data discrepancy and regularizer as input in each iteration. We present results of such a partially learned gradient scheme on a non-linear tomographic inversion problem with simulated data from both the Shepp-Logan phantom as well as a head CT. The outcome is compared against FBP and TV reconstruction and the proposed method provides a 5.4 dB PSNR improvement over the TV reconstruction while being significantly faster, giving reconstructions of 512 x 512 volumes in about 0.4 seconds using a single GPU.
    An implementation is here: https://github.com/adler-j/learned_gradient_tomography
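
    To make the "learned gradient" idea above more concrete, here is a minimal sketch of one unrolled iteration in TensorFlow 1.x style. This is my own illustration, not the authors' code from the linked repository; forward_op, adjoint_op, initial_guess, data and ground_truth are assumed placeholders for the problem at hand.

        # Illustrative sketch of a partially learned gradient scheme (not the official code).
        import tensorflow as tf

        def learned_update(x, data, reuse):
            # Gradient of the data discrepancy 0.5 * ||A(x) - data||^2,
            # built from the assumed forward operator and its adjoint.
            residual = forward_op(x) - data
            grad_data = adjoint_op(residual)
            # A small CNN decides how to combine the current iterate with that gradient.
            with tf.variable_scope("update", reuse=reuse):
                inp = tf.concat([x, grad_data], axis=-1)
                h = tf.layers.conv2d(inp, 32, 3, padding="same", activation=tf.nn.relu)
                h = tf.layers.conv2d(h, 32, 3, padding="same", activation=tf.nn.relu)
                dx = tf.layers.conv2d(h, 1, 3, padding="same")   # learned correction
            return x + dx

        x = initial_guess                        # e.g. zeros or an FBP image
        for i in range(10):                      # a fixed number of unrolled iterations
            x = learned_update(x, data, reuse=(i > 0))
        loss = tf.reduce_mean(tf.square(x - ground_truth))   # trained end-to-end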
     
     

              Densely Connected Convolutional Networks - implementations -        

    Densely Connected Convolutional Networks by Gao Huang, Zhuang Liu, Kilian Q. Weinberger, Laurens van der Maaten

    Recent work has shown that convolutional networks can be substantially deeper, more accurate, and efficient to train if they contain shorter connections between layers close to the input and those close to the output. In this paper, we embrace this observation and introduce the Dense Convolutional Network (DenseNet), which connects each layer to every other layer in a feed-forward fashion. Whereas traditional convolutional networks with L layers have L connections - one between each layer and its subsequent layer - our network has L(L+1)/2 direct connections. For each layer, the feature-maps of all preceding layers are used as inputs, and its own feature-maps are used as inputs into all subsequent layers. DenseNets have several compelling advantages: they alleviate the vanishing-gradient problem, strengthen feature propagation, encourage feature reuse, and substantially reduce the number of parameters. We evaluate our proposed architecture on four highly competitive object recognition benchmark tasks (CIFAR-10, CIFAR-100, SVHN, and ImageNet). DenseNets obtain significant improvements over the state-of-the-art on most of them, whilst requiring less memory and computation to achieve high performance. Code and models are available at this https URL .
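
    As a rough illustration of the dense connectivity pattern described above, here is a minimal Keras-style sketch of a single dense block. It is my own simplification rather than any of the reference implementations listed below, and num_layers and growth_rate are assumed hyperparameters.

        # Illustrative dense block: each layer takes all preceding feature maps as input.
        from keras.layers import Activation, BatchNormalization, Concatenate, Conv2D

        def dense_block(x, num_layers=4, growth_rate=12):
            features = [x]
            for _ in range(num_layers):
                h = Concatenate()(features) if len(features) > 1 else features[0]
                h = BatchNormalization()(h)
                h = Activation("relu")(h)
                h = Conv2D(growth_rate, (3, 3), padding="same")(h)
                features.append(h)               # reuse this layer's output downstream
            return Concatenate()(features)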


    From the main implementation page at: https://github.com/liuzhuang13/DenseNet

    "..Other Implementations
    1. Our Caffe Implementation
    2. Our (much more) space-efficient Caffe Implementation.
    3. PyTorch Implementation (with BC structure) by Andreas Veit.
    4. PyTorch Implementation (with BC structure) by Brandon Amos.
    5. MXNet Implementation by Nicatio.
    6. MXNet Implementation (supporting ImageNet) by Xiong Lin.
    7. Tensorflow Implementation by Yixuan Li.
    8. Tensorflow Implementation by Laurent Mazare.
    9. Tensorflow Implementation (with BC structure) by Illarion Khlestov.
    10. Lasagne Implementation by Jan Schlüter.
    11. Keras Implementation by tdeboissiere.
    12. Keras Implementation by Roberto de Moura Estevão Filho.
    13. Keras Implementation (with BC structure) by Somshubra Majumdar.
    14. Chainer Implementation by Toshinori Hanya.
    15. Chainer Implementation by Yasunori Kudo.
    16. Fully Convolutional DenseNets for segmentation by Simon Jegou...."



              Slides: Deep Learning and Reinforcement Learning Summer School 2017 @ MILA Montreal, Canada        
    The Deep Learning and Reinforcement Learning Summer School 2017 just finished and here are some of the slides presented there (videos should be coming later) 




              è‹å‰‘æž—        
    Try upgrading tensorflow to the latest version and see.
              #346: JS oddities, V8 6.1, and an online VS Code IDE for JavaScript        
    This week's JavaScript news — Read this e-mail on the Web
    JavaScript Weekly
    Issue 346 — August 4, 2017
    Around 40 examples of ‘quirky’ JavaScript code with unexpected results or outcomes. Mostly interesting to learn about odd edge cases.
    Denys Dovhan

    Get the VS Code experience in your browser. Introductory post here.
    Eric Simons

    In beta until the release of Chrome 61, 6.1 has a smaller binary, includes some significant performance improvements when iterating over maps and sets, and asm.js code is now transpiled to WebAssembly.
    Mathias Bynens

    Sencha, Inc
    Multi-directional scrolling with a fixed header, paging, grouping and editing data in cells are just a few of the capabilities of the ExtReact grid. Try ExtReact for free to see how easy it is to add the grid and many other components into your apps.
    Sencha, Inc   Sponsor

    Nuxt.js is a framework for bringing server-side rendering (SSR) to your Vue.js apps, similar to how Next.js does with React.
    Olayinka Omole

    Adopting a more functional approach let the author stop using bits of JavaScript he didn’t like.
    Joel Thoms

    A practical introduction to service workers (scripts that run in the background separate from a Web page context) and how to easily create one using Ember.
    Adnan Chowdhury

    Jobs Supported by Hired.com

    Can't find the right job? Want companies to apply to you? Try Hired.com.

    In Brief

    TypeScript's Type System is Turing Complete news
    Henning Dieterichs

    webpack Awarded $125,000 By Mozilla news
    To implement first-class support for WebAssembly.
    Sean T. Larkin

    Register for the Polymer Summit in Copenhagen on 22-23 August news
    Learn more about the talks and workshops at this year's Polymer Summit, and see who our amazing speakers are.
    Google, Inc.  Sponsor

    W3C Launches a WebAssembly Working Group news
    Bradley Nelson

    Machine Learning Comes to Your Browser with JavaScript news
    With a new JS library that runs Google’s TensorFlow in the browser.
    InfoWorld

    A Look at the 'Null Propagation Operator' Proposal tutorial
    Provides an alternative to endless null checks.
    Nicolás Bevacqua

    Creating Custom Inputs with Vue.js tutorial
    Understand how v-model works on native inputs and custom components.
    Joseph Zimmerman

    A Reintroduction to 'this' in JavaScript tutorial
    Zell Liew

    Build your first JavaScript, Android, or iOS app with MongoDB Stitch tutorial
    Get started with the beta release of MongoDB's backend-as-a-service with step-by-step tutorials and sample apps.
    MONGODB  Sponsor

    How the Proposed 'Class Fields' for JavaScript Would Work tutorial
    Dr. Axel Rauschmayer

    D3 in Depth: An Intermediate Guide to Building D3 Visualizations tutorial
    Peter Cook

    JavaScript Riddles for Fun and Profit video
    Poses a series of ever more challenging JavaScript riddles and brain-teasers.
    Dan Shappir

    Use const Until You Have to Use let opinion
    Vince Campanale

    Why We Broke Our Philosophical Vows to Bring You CircleCI 2.0 story
    CircleCI  Sponsor

    Vuestic: A New Vue.js-Powered Admin Dashboard code
    Demo here.
    Epicmax

    Express Gateway: A Microservice API Gateway Built on Express code node

    Turf: A Modular (Geo)Spatial Analysis Engine code
    Morgan Herlocker

    Glamorous v4 Released: CSS Styling for React Components code
    Kent C. Dodds

    ProseMirror: A Toolkit for Building Rich-Text Editors for the Web code
    Marijn Haverbeke

    Curated by Peter Cooper and published by Cooperpress.

    Like this? You may also enjoy: FrontEnd Focus : Node Weekly : React Status


    © Cooperpress Ltd. Fairfield Enterprise Centre, Lincoln Way, Louth, LN11 0LS, UK


              Google releases TensorFlow Serving library        

    Google has just moved to a production release of TensorFlow Serving, its open source library for serving machine-learned models in production environments. A beta version of the technology was released in February.

    Part of Google’s TensorFlow machine intelligence project, the TensorFlow Serving 1.0 library is intended to aid the deployment of algorithms and experiments while maintaining the same server architecture and APIs. TensorFlow Serving lets you push out multiple versions of models over time, as well as roll them back.

    The library of course integrates with TensorFlow learning models, but it can also be extended to serve other model types.



              IBM speeds deep learning by using multiple servers        

    For everyone frustrated by how long it takes to train deep learning models, IBM has some good news: It has unveiled a way to automatically split deep-learning training jobs across multiple physical servers -- not just individual GPUs, but whole systems with their own separate sets of GPUs.

    Now the bad news: It's available only in IBM's PowerAI 4.0 software package, which runs exclusively on IBM's own OpenPower hardware systems.

    Distributed Deep Learning (DDL) doesn't require developers to learn an entirely new deep learning framework. It repackages several common frameworks for machine learning: TensorFlow, Torch, Caffe, Chainer, and Theano. Deep learning projects that use those frameworks can then run in parallel across multiple hardware nodes.



              TensorFlow – Training and Convergence

    A key component of most artificial intelligence and machine learning systems is a loop in which the system improves over a large number of training iterations. The simplest way to train in this manner is to use a for loop, and we used one in the article "TensorFlow – переменные" ("TensorFlow – Variables"). As a reminder: import tensorflow as tf x = tf.Variable(0., name='x') model = tf.global_variables_initializer() […]
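
    A minimal sketch of what such a training loop looks like in TensorFlow 1.x; this is my own completion of the truncated snippet above, with a toy objective, and is not necessarily identical to the original article's code.

        import tensorflow as tf

        x = tf.Variable(0., name='x')
        loss = tf.square(x - 5.0)                                    # toy objective: drive x towards 5
        step = tf.train.GradientDescentOptimizer(0.1).minimize(loss)
        model = tf.global_variables_initializer()

        with tf.Session() as session:
            session.run(model)
            for i in range(100):                                     # the training loop
                session.run(step)
            print(session.run(x))                                    # close to 5.0 after convergence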

    More details in TensorFlow – Training and Convergence at Dev-Ops-Notes.RU.


              TensorFlow – Linear Equations

    Using the tf.matrix_solve function, TensorFlow can also solve systems of linear equations (a set of related equations written, for example, as follows): \begin{align} 3x+2y=15\tag{1}\\ 4x-y=10\tag{2} \end{align} In principle, these equations can be solved by a whole range of methods, but in this article we will look at how to use tf.matrix_solve in TensorFlow for this purpose. Example with an ordinary straight line. To begin with, let's consider […]
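
    A minimal sketch of how the system above can be solved with tf.matrix_solve (my own example, not the article's code): write the system as Ax = b, where A is the coefficient matrix and b the right-hand side.

        import tensorflow as tf

        # 3x + 2y = 15
        # 4x -  y = 10
        A = tf.constant([[3., 2.],
                         [4., -1.]])
        b = tf.constant([[15.],
                         [10.]])
        solution = tf.matrix_solve(A, b)

        with tf.Session() as session:
            print(session.run(solution))   # [[x], [y]], approximately [[3.18], [2.73]]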

    More details in TensorFlow – Linear Equations at Dev-Ops-Notes.RU.


              TensorFlow – Reading Files
    TensorFlow – Reading Files

    TensorFlow supports reading large datasets in such a way that the data is never held in memory in its entirety (it would not be great if it had that limitation). There are several functions and options from the standard Python library that you can use to solve this task. Moreover, TensorFlow supports creating custom data handlers, and this […]
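
    As a rough illustration (my own sketch, not the article's code), the classic queue-based pattern for reading a CSV file line by line in TensorFlow 1.x looks roughly like this; data.csv and its two float columns are assumptions for the example.

        import tensorflow as tf

        filename_queue = tf.train.string_input_producer(["data.csv"], num_epochs=1)
        reader = tf.TextLineReader()
        key, value = reader.read(filename_queue)
        col1, col2 = tf.decode_csv(value, record_defaults=[[0.0], [0.0]])

        with tf.Session() as session:
            session.run(tf.local_variables_initializer())    # required when num_epochs is set
            coord = tf.train.Coordinator()
            threads = tf.train.start_queue_runners(coord=coord)
            try:
                while not coord.should_stop():
                    print(session.run([col1, col2]))          # one record at a time
            except tf.errors.OutOfRangeError:
                pass                                          # reached the end of the file
            finally:
                coord.request_stop()
                coord.join(threads)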

    More details in TensorFlow – Reading Files at Dev-Ops-Notes.RU.


              Keras and TensorFlow – Text Classification

    As you may have noticed, a large number of articles teaching how to use TensorFlow have recently appeared in the Machine Learning category. While that section fills up with material, I decided to stoke your interest in machine learning and write an article about the practical use of TensorFlow and Keras for classifying text into categories. Problem description: Recently, colleagues approached me with […]
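
    A minimal sketch of that kind of Keras text classifier (my own illustration on made-up data, not the article's actual model); texts and labels stand in for the real corpus and its category ids.

        import numpy as np
        from keras.preprocessing.text import Tokenizer
        from keras.models import Sequential
        from keras.layers import Dense

        texts = ["first example document", "second example document"]   # assumed corpus
        labels = np.array([0, 1])                                        # assumed category ids

        tokenizer = Tokenizer(num_words=10000)
        tokenizer.fit_on_texts(texts)
        x = tokenizer.texts_to_matrix(texts, mode="tfidf")               # bag-of-words features

        model = Sequential()
        model.add(Dense(64, activation="relu", input_shape=(x.shape[1],)))
        model.add(Dense(2, activation="softmax"))                        # two assumed categories
        model.compile(optimizer="adam", loss="sparse_categorical_crossentropy",
                      metrics=["accuracy"])
        model.fit(x, labels, epochs=5, batch_size=32)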

    More details in Keras and TensorFlow – Text Classification at Dev-Ops-Notes.RU.


              Tensorflow and why i love it         
              We will be holding a "TensorFlow hands-on for beginners @ Sakura Internet Osaka head office"
    We will be holding a "TensorFlow hands-on for beginners @ Sakura Internet Osaka head office".
              What is the actual difference between a 200K, 500K, and 1M algorithm engineer?

    In the seventh century AD, during a rain-praying ceremony organized by the national weather bureau of the Kingdom of Chechi (from Journey to the West), the three immortals Tiger Strength, Deer Strength, and Goat Strength successfully prayed down sweet rain and saved the people from fire and flood. The old king, though he did not understand how it was done, from then on honored them as state preceptors and took Taoism as his guiding creed.

    This century, algorithm engineers are in a similar situation. In earlier years, the bosses who believed in the "rough, fast, and fierce" school thought they were idle freeloaders who, unable to find real jobs, hid out in school doing PhDs and used statistical sleight of hand to play at being wizards. But after AlphaGo crushed Lee Sedol last year, those same bosses, swearing under their breath, scrambled to package all kinds of traffic-hijacking and food-delivery businesses as artificial intelligence, and hurried to invite a few algorithm-savvy "state preceptors" to bless them. Although they do not really understand what these preceptors do, they devoutly hope the rain will come soon.

    As a result, algorithm engineers' market value has risen with the tide. Engineers of every school, whether they used to practice java, php, or excel, have abandoned the "best language" debate, picked up deep learning, and sworn to retrain themselves as algorithm engineers. A few days ago someone even asked me on Zhihu: what exactly is the difference between algorithm engineers earning 200K, 500K, and 1M a year?

    Such a money-minded question actually pleases me. Although in Beijing 200K can no longer realistically hire a reliable algorithm engineer, let's use the figures above as a reference and talk about the three levels of algorithm engineers. (The "algorithms" here are not the ones in the undergraduate course "Algorithms and Data Structures". That course covers deterministic algorithms such as sorting and searching; here we mean probabilistic algorithms that model data with statistical methods.) Some algorithms and models will be mentioned below, but only to illustrate concepts; there is no need to dig deep, and interested readers can look up the material themselves.

    Level one, "Operating": able to use the tools

    Engineers at this level are fairly familiar with the common models; when data arrives, they can at least pick a reasonable one and run it.

    Reaching this level does not take much. In earlier years, as long as you knew what LDA and SVM were and had played a few times with open source tools such as liblinear or mahout, you could take some data and produce a result. In the deep learning era this seems even easier: whatever the problem, just pile on neural networks! Lately I often meet engineers who, having successfully run the Tensorflow demo, cheer with delight: I've learned deep learning, tomorrow I shall rule humanity!

    If it were really that simple, I'd be an eggplant. No matter how fluent you are with the eighteen kinds of open source tools, you will not build a robot that beats Ke Jie. Time for a bucket of cold water: anyone entering this field should first understand the "No Free Lunch theorem". Its mathematical statement is rather obscure, so let's translate it into not-entirely-accurate plain language:

    If two models have a multi-round duel, with a different dataset in each round and no particular bias in the datasets, then nine times out of ten the final result is a draw.

    Whether you are an ordinary model, a fancy model, or a silly model, none gets to look down on the others. Consider an extreme case: one of the contestants is "random guessing", i.e., it gives answers with no basis at all. What happens? Right, still a draw! So please stop asking silly questions like "which algorithm works best for clustering".

    Which is awkward! Because mastering a pile of models and being able to run them is, by itself, basically useless. Of course, the data distribution of a real problem always has certain characteristics; in face recognition, for example, there has to be a big round blob somewhere in the middle of the image. So the question "which model is good for face recognition" is meaningful. The real value of an algorithm engineer is to perceive the prior characteristics of the problem's data and express them in the model, and that requires the next level of ability.

    Being able to use the tools is merely entry level for an algorithm engineer. Solving problems with only these skills is like wanting to perform abdominal surgery after having slaughtered two chickens: highly unreliable. If it weren't for the severe salary inflation in the internet industry, I'd say 200K is a fairly reasonable price.

    Level two, "Optimization": able to adapt models

    Engineers at this level can modify a model according to the data characteristics of the specific problem, and adopt a suitable optimization algorithm in pursuit of the best results.

    However elegant our predecessors' models are, they were designed around the prior characteristics of the data observed at the time. LDA, for instance, introduces Bayesian estimation on top of PLSA to obtain more robust topics when corpus quality is low. Using LDA will not be terribly wrong, but to get the best results on your specific problem, precise model adaptation based on the data's characteristics is unavoidable.

    With internet data this is even more obvious, because no two companies have similar data. Baidu's click-through-rate model has billions of features, a large custom compute cluster, and a unique deep neural network structure. Can you copy it? Even if you did, it would be useless. Using textbook models as a one-size-fits-all answer is like carving a mark on the boat to find a sword dropped in the river.

    The ability to adapt models is not as simple as using a few open source tools. It requires two kinds of competence:

    First, a deep understanding of machine learning principles and components. The field has many foundational principles and components that do not look directly useful. For example: how do you do regularization? When should which basic distribution be chosen (as in the table below)? How should a Bayesian prior be set? How do you compute the distance between two probability distributions? When you watch the masters cook these ingredients into finished dishes like LDA or CNN, ask yourself whether, if you were the cook, you would understand the ingredients and know how to choose and combine them. Merely being able to eat a few dishes and describe their taste leaves you far from being a good chef.

    Second, mastery of optimization methods. A machine learning practitioner who does not understand optimization is like a martial artist who only knows forms, like the "thunder taiji" master or Master Yan Fang: in a real fight they end up bruised and bloodied. No matter how impressive the model you design, if you cannot find the optimum under limited computing resources, it is nothing but a decorative vase.

    Optimization is the most, most, most important foundation of machine learning. You need to know which optimization method to choose for different forms of the objective function and its derivatives; the time and space complexity and convergence behavior of each method; and how to construct an objective function so that it can be solved with convex optimization or other frameworks. Training in these areas needs to be even more solid than training in the models themselves.

    Take deep learning, which everyone assumes is a one-size-fits-all answer, as an example. For modeling time-series data such as speech or natural language with neural networks, an RNN (see the figure above) is a natural choice. In practice, however, people found that because of the "vanishing gradient" phenomenon, RNNs struggle to model long-range context dependencies. In natural language, for instance, deciding whether the next "be" verb should be "is" or "are" may require going back many words to find the subject that determines it. What to do? The brilliant J. Schmidhuber designed the LSTM model with gate structures (see the figure below), letting the data itself decide which information to keep and which to forget. With this, the quality of natural language modeling improved greatly. Looking at the two structure diagrams comparing RNN and LSTM for the first time, the extra gates may leave you baffled; only by seeing through the underlying methodology, with a solid foundation in machine learning and optimization, can you gradually understand and learn this way of thinking.

    Of course, the LSTM is a stroke of genius that the rest of us can admire but hardly reach. Still, the key ability demonstrated in this example, adjusting the model to the characteristics of the problem and overcoming obstacles in optimization, is what a competent algorithm engineer should strive for. Finding such a person for 500K a year is money well spent.

    Level three, "Objective": skilled at defining the problem

    An engineer at this level (well, "engineer" hardly seems the right word anymore) can be handed a new real-world problem and produce a quantitative objective function for it.

    The story goes that when Ford asked for help repairing a failing electrical machine, Steinmetz drew a line on the casing and told the workers to open it there, quickly fixing the fault. When it came time to settle the bill, Steinmetz asked for 10,000 dollars and itemized it: drawing one line, 1 dollar; knowing where to draw it, 9,999 dollars.

    By the same token, in the algorithm world the hardest part is knowing where to draw the line, and that is the process of constructing an objective function for a new problem. Having a clear, quantitative objective function is precisely what distinguishes the scientific method from mysticism and theology.

    An objective function can sometimes be written in analytical form, and sometimes it cannot. For web search, for example, there are two kinds of objective: one is nDCG, a metric that can be computed explicitly on a labeled dataset; the other is the proportion of bad cases found by human review, which clearly cannot be computed by a formula, yet its result is still quantitative and can also serve as an objective.

    Defining an objective function does not sound so hard at first: isn't it just setting a KPI? Not really. Doing it well sets a high bar both in mindset and in technique.

    First, build the mindset that "everything else is secondary; only the objective is supreme". For a team or a project, once a correct, measurable objective is established, reaching it is only a matter of time and cost. If nDCG were the right objective function for search, then Microsoft or Yahoo! would sooner or later have caught up with Google; unfortunately, nDCG as an objective has some problems, which is why those two fell further and further behind.

    As the saying goes, "when the root is established, the Way grows": at the start of a project, two things should always come first: one, discuss and define a clear, quantitative objective function; two, build an experimental framework that can run online A/B tests against that objective. What data to collect and which model to adopt come second.

    Second, be able to construct objective functions that are accurate, solvable, and elegant. The objective should reflect the actual business goal as closely as possible while admitting a feasible optimization method. Generally, the optimization objective differs from the evaluation objective. In speech recognition, for example, the evaluation metric is "word error rate", but it is not differentiable and cannot be optimized directly; so we need a "surrogate objective", such as likelihood or posterior probability, for solving the model parameters. The evaluation metric is often intuitive to define, but turning it into an optimization objective that is highly correlated and convenient to solve takes considerable experience and skill. In acoustic modeling, even computing the likelihood involves fairly complex algorithms such as Baum-Welch; defining it cleanly is no simple matter.

    Elegance is a higher-order requirement; yet for major problems, elegance is often the only way, because usually only a beautiful framework comes close to the essence of the problem. On this point, one must mention the most enlightening work of recent years: the generative adversarial network (GAN).

    What GAN tackles are creative problems: getting a machine to learn from data how to paint, write, and so on. How do you define the objective function for machine painting? It sounds baffling. Years ago, when we worked on the analogous problem of speech synthesis, we had no good method either and could only have people listen sentence by sentence and score the output. The stroke of genius is that Ian Goodfellow, in defining the problem, adopted the clever framework shown in the figure below:

    Since human scoring is slow, laborious, and subjective, why not let a machine do the scoring! Fortunately, getting a machine to recognize an image with specific semantics, such as a face, is essentially a solved problem in deep learning. So suppose we already have a scoring machine D, and we want to train a painting machine G: let G keep painting and D keep scoring, and once G's works score highly with D, the training is done. Meanwhile, D's discernment also improves from exposure to large numbers of imitations, which in turn trains G better. Qualitative thinking alone is not enough, though; this cleverly designed two-player zero-sum game can also be expressed as the following mathematical problem:
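
    The standard GAN minimax objective from Goodfellow et al., presumably the formula being referred to here, is:

        \[
        \min_G \max_D \; V(D, G) = \mathbb{E}_{x \sim p_{\mathrm{data}}(x)}[\log D(x)] + \mathbb{E}_{z \sim p_z(z)}[\log(1 - D(G(z)))]
        \]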

    An objective like this is as elegant as a philosophical question, yet it can genuinely be pursued. When I first saw this formula, I immediately felt that teaching machines to paint was only a matter of time. If a problem statement like this also delights you, you can appreciate why this is the hardest step.

    The anchor of a team is the person who can turn a problem into an objective function, even if he cannot use a single open source tool. Finding such a person for 1M a year is a real bargain.

    In machine learning, the path of advancement under an algorithm engineer's feet is clear: once you have mastered the tools, can adapt models, and can then take charge of modeling new problems, you can grow into top talent. Walk this path steadily and 1M is not a problem. What's that? You say some people make 3M? No need to be envious: they simply spent all the time you spend writing code on job-hopping instead.


              Google’s TensorFlow Bet May Be Starting To Pay Off (GOOGL)        

    TensorFlow is Google’s open source software library that makes it easier for coders and developers to design, build, train and ...

    The post Google’s TensorFlow Bet May Be Starting To Pay Off (GOOGL) appeared first on Wall Street Pit.


              Comment on Apache SystemML by "Microsoft launches Cognitive Toolkit 2, an open source toolkit for artificial intelligence" | | Le Diligent
    […] That's why all the technology companies are trying to win users over with their own solutions: AWS with MXNet, Facebook with Caffe2, Google with TensorFlow, or IBM with SystemML. […]
               Announcing SyntaxNet: The World’s Most Accurate Parser Goes Open Source         
    https://github.com/tensorflow/models/tree/master/syntaxnet

    https://research.googleblog.com/2016/05/announcing-syntaxnet-worlds-most.html
              9 Ways to Get Help with Deep Learning in Keras        

    Keras is a Python deep learning library that can use the efficient Theano or TensorFlow symbolic math libraries as a backend. Keras is so easy to use that you can develop your first Multilayer Perceptron, Convolutional Neural Network, or LSTM Recurrent Neural Network in minutes. You may have technical questions when you get started using […]

    The post 9 Ways to Get Help with Deep Learning in Keras appeared first on Machine Learning Mastery.


              Artificial Intelligence / Machine Learning        
    OpenAI TensorFlow
              Link roundup #10        
    Another big backlog of links from the past few months. I need to get better at sending these in smaller digests.


              Realistic alternatives to Apple computers        
    I'm disappointed with the new MacBook Pros and I wrote my thoughts about them here. Since the announcement, I've been researching all of my options and weighing the pros and cons. What follows comes from my own assessment of 16 laptops, their features, and reviews I've read about them. I'll highlight the ones which I think are the top five alternatives to Apple's computers. At the end there is a grid of all the options and links to more info. The machines I'm evaluating are either for sale right now or will be shipping by the end of the year. I'm not holding out for any rumored products.

    These are the attributes that I think are important when choosing a new laptop:

    Must have:

    • 13" form-factor
    • Thunderbolt 3 ports
    • Headphone jack
    • Works decently with Linux

    Prefer:

    • HiDPI display (more than 200 pixels per inch)
    • 7th generation Core i7 CPU
    • 16 GB of RAM
    • USB-C ports

    Ambivalent:

    • Flip form-factor (aka "2-in-1")
    • USB 3.0 old-style A connectors
    • More than 6 hours of battery life

    Avoid:

    • Proprietary power plug (USB-C charging is better)
    • HDMI ports
    • SD card reader
    • Display port

    It's worth emphasizing how valuable Thunderbolt 3 is. With its 40Gbps transfer rate, Thunderbolt 3 has made "external GPU" enclosures a real thing, and the options are increasing. In 2017, you should expect to dock your laptop into a gnarly GPU and use it for some intensive computation (VR, 3D design, neural network back propagation). Thunderbolt 3 also makes it easy to connect into one or more 4K+ external displays when you're not on the go. Not having Thunderbolt 3 significantly limits your future options.

    The other details to look for are Skylake (6th generation) vs. Kaby Lake (7th generation) processors, and Core i5/i7 vs. Core M processors. The differences are subtle but meaningful. All of the new MacBook Pros and the MacBook 12" have 6th generation CPUs. The MacBook Pros have i5/i7 chips. The 12" MacBooks have m3/m5/m7 chips. It's a bit odd that the latest and greatest from Apple includes chips that were released over a year ago.


    Here's my list of options, ordered by which ones I'm most seriously considering:


    1. HP Spectre x360

    Official product page and someone else's review that I found helpful.

    It doesn't have a HiDPI display, but everything else looks sleek and great. The previous year's model was also available in a 4K version, but that doesn't have any Thunderbolt 3 ports. If they do release a variation of the new one in 4K, that model would be the winner for me by every measure.



    Price: $1,299
    Pros: Two Thunderbolt 3 ports. Charge via USB-C. 2-in-1 laptop.
    Cons: HD Display.
    Thickness: 13.71mm
    Weight: 2.86lbs
    Battery: 57Wh
    Display: 1920 x 1080 (Touch)
    CPU: Intel 7th Generation Core i7-7500U dual core
    RAM: 16GB
    Storage: 512GB Flash
    Graphics: Intel HD Graphics 620
    Power plug: USB-C
    Thunderbolt 3 ports: 2
    USB-C (non-Thunderbolt) ports: 0
    USB 3.0 A ports: 1
    SD slots: None
    Video ports: None
    Audio ports: Headphone/mic jack


    2. Razer Blade Stealth 4K

    Official product page and someone else's review that I found helpful.

    With extra ports and a thick bezel it's not as svelte as I'd like. But the build quality seems high and I bet the 4K display looks awesome. Razer's Core external GPU is the easiest setup of its kind right now. There's also a cheaper option for $1,249 with less storage and a 2560 x 1440 screen (which is HiDPI like a MacBook but not close to 4K).



    Price: $1,599
    Pros: 4K display. One Thunderbolt 3 port. Charge via USB-C.
    Cons: No USB-C ports besides the single Thunderbolt 3 one. Unnecessary video out. Big bezel around a small physical screen.
    Thickness: 13.1mm
    Weight: 2.84lbs
    Battery: 53.6Wh
    Display: 3840 x 2160 (Touch)
    CPU: Intel 7th Generation Core i7-7500U dual core
    RAM: 16GB
    Storage: 512GB Flash
    Graphics: Intel HD Graphics 620
    Power plug: USB-C
    Thunderbolt 3 ports: 1
    USB-C (non-Thunderbolt) ports: 0
    USB 3.0 A ports: 2
    SD slots: None
    Video ports: HDMI
    Audio ports: Headphone/mic jack


    3. Dell XPS 13

    Official product page and someone else's review that I found helpful.

    This laptop has a modern edge-to-edge screen, but it's not quite 4K. I wouldn't look forward to lugging around the Dell-specific power cable (and being screwed when I lose it). Update: Blaine Cook corrected me in the comments: It turns out that it can charge via USB-C in addition to the proprietary power plug. Hooray! — Its ports, slots, and camera are a bit quirky. But, strongly in favor, it's also the laptop that Linus uses! There's a cheaper version with less storage and a slower i5 CPU for $1,399.



    Price: $1,849
    Pros: One Thunderbolt 3 / USB-C port. Nearly 4K display.
    Cons: Expensive. Unnecessary SD card slot. Proprietary power plug. Webcam is in a weird location.
    Thickness: 9-15mm
    Weight: 2.9lbs
    Battery: 60Wh
    Display: 3200 x 1800 (Touch)
    CPU: Intel 7th Generation Core i7-7500U dual core
    RAM: 16GB
    Storage: 512GB Flash
    Graphics: Intel HD Graphics (unspecified version)
    Power plug: Proprietary
    Thunderbolt 3 ports: 1
    USB-C (non-Thunderbolt) ports: 0
    USB 3.0 A ports: 2
    SD slots: SD slot
    Video ports: None
    Audio ports: Headphone/mic jack


    4. HP EliteBook Folio G1

    Official product page and someone else's review that I found helpful.

    This machine is tiny, fanless, and looks like a MacBook Air at first glance. It has Thunderbolt 3 and none of the old ports weighting it down. And 4K! The biggest drawback is that the CPU is a 6th generation Core M processor instead of an i5 or i7. If the 12" MacBook is more your speed than the MacBook Pro, then this could be the right machine for you.



    Price: $1,799
    Pros: Charge via USB-C. Two Thunderbolt 3 ports. 4K display.
    Cons: Expensive. Underpowered 6th-generation M CPU. Max 8GB of RAM.
    Thickness: 11.93mm
    Weight: 2.14lbs
    Battery: 38Wh
    Display: 3840 x 2160
    CPU: Intel 6th Generation m7-6Y75 dual core
    RAM: 8GB
    Storage: 256GB Flash
    Graphics: Intel HD Graphics 515
    Power plug: USB-C
    Thunderbolt 3 ports: 2
    USB-C (non-Thunderbolt) ports: 0
    USB 3.0 A ports: 0
    SD slots: None
    Video ports: None
    Audio ports: Headphone/mic jack


    5. Lenovo Yoga 910

    Official product page and someone else's review that I found helpful.

    If this had a Thunderbolt 3 port, I think it would be the laptop to get. It has a 4K screen and the styling looks great. Unfortunately, instead of Thunderbolt 3, Lenovo included a USB-C port that only speaks USB 2.0 protocol (not a typo, it's version two) and is used for charging. There's a cheaper option with less storage and RAM for $1,429.



    Price: $1,799
    Pros: Two USB-C ports. Charge via USB-C. 4K display. 2-in-1 laptop.
    Cons: No Thunderbolt 3 ports. Small battery. Expensive. One of the USB-C ports is a USB 2.0 port.
    Thickness: 14.3mm
    Weight: 3.04lbs
    Battery: 48Wh
    Display: 3840 x 2160 (Touch)
    CPU: Intel 7th Generation i7-7500U dual core
    RAM: 16GB
    Storage: 1TB Flash
    Graphics: Intel HD Graphics 620
    Power plug: USB-C
    Thunderbolt 3 ports: 0
    USB-C (non-Thunderbolt) ports: one 3.0 port, one 2.0 port
    USB 3.0 A ports: 1
    SD slots: None
    Video ports: None
    Audio ports: Headphone/Microphone combined jack


    Conclusion

    I'm still not sure which computer I'm going to get. I'm now looking through Linux distributions like Ubuntu and elementary OS to see what compatibility and usability are like. I doubt that 2017 will be the "year of the Linux laptop", but for the first time I'm willing to give it an honest try.

    Make no mistake: I think that Apple computers are still gorgeous and a great choice for people who have the budget. I plan to continue recommending MacBooks to family members, friends, acquaintances, and all of the other non-technical people in my life. I think "it just works" is still true for the low-end, and that's ideal for consumers. But consumers have very different needs than professionals.

    For a long time, Apple has been a lofty brand, the "insanely great" hardware that people bought because they aspired to "think different". It's looking like that era may be over. Apple may have completed their transition into a mass-market company that makes relatively high quality hardware for normal people. There's nothing wrong with that. But it's probably not for me.


    Here's the full list of the computers I considered, in the order I ranked them:

    HP Spectre x360 | Pros: Two Thunderbolt 3 ports. Charge via USB-C. 2-in-1 laptop. | Cons: HD Display. | Price: $1,299
    Razer Blade Stealth 4K | Pros: 4K display. One Thunderbolt 3 port. Charge via USB-C. | Cons: No USB-C ports besides the one Thunderbolt 3 one. Unnecessary video out. Big bezel. | Price: $1,599
    Razer Blade Stealth QHD | Pros: HiDPI display. One Thunderbolt 3 port. Charge via USB-C. | Cons: No USB-C ports besides the one Thunderbolt 3 one. Unnecessary video out. Big bezel. | Price: $1,249
    Apple MacBook Pro 13" with upgrades | Pros: Two Thunderbolt 3 ports. Charge via USB-C. HiDPI display. Good video card. | Cons: 6th generation CPU. Expensive. | Price: $1,999
    Apple MacBook Pro 13" | Pros: Two Thunderbolt 3 ports. Charge via USB-C. HiDPI display. Good video card. | Cons: Underpowered i5 CPU. 6th generation CPU. Expensive. | Price: $1,499
    Dell XPS 13 with upgrades | Pros: One Thunderbolt 3 / USB-C port. Nearly 4K display. | Cons: Expensive. Unnecessary SD card slot. No USB-C ports. Proprietary power plug. | Price: $1,849
    Dell XPS 13 | Pros: One Thunderbolt 3 / USB-C port. Nearly 4K display. | Cons: Underpowered i5 CPU. Unnecessary SD card slot. No USB-C ports. Proprietary power plug. | Price: $1,399
    HP EliteBook Folio G1 Notebook PC | Pros: Charge via USB-C. Two Thunderbolt 3 ports. 4K display. | Cons: Expensive. Underpowered 6th-generation M CPU. Max 8GB of RAM. | Price: $1,799
    Lenovo Yoga 910 with upgrades | Pros: Two USB-C ports. Charge via USB-C. 4K display. 2-in-1 laptop. | Cons: No Thunderbolt 3 ports. Small battery. Expensive. One of the USB-C ports is a USB 2.0 port. | Price: $1,799
    Lenovo Yoga 910 | Pros: Two USB-C ports. Charge via USB-C. 4K display. 2-in-1 laptop. | Cons: No Thunderbolt 3 ports. Small battery. One of the USB-C ports is a USB 2.0 port. Only 8GB of RAM. | Price: $1,429
    Apple 12" MacBook | Pros: Charge via USB-C. HiDPI display. | Cons: 6th generation CPU. Poor webcam. Only one USB-C port. No Thunderbolt 3 ports. Expensive. Only 8GB of RAM. | Price: $1,749
    Asus ZenBook UX306 13" | Pros: USB-C port. Nearly 4K display. | Cons: Only one USB-C port. No Thunderbolt 3 ports. Proprietary power plug. Unnecessary video out ports. 6th generation CPU. | Price: Goes on sale any day now
    Acer Swift 7 | Pros: Charge via USB-C. Two USB-C ports. | Cons: No Thunderbolt 3 ports. HD display. Underpowered i5 CPU. Small battery. | Price: $1,099
    HP Spectre 13 | Pros: Two Thunderbolt 3 ports. Charge via USB-C. | Cons: HD Display. Only 8GB of RAM available. Small battery. | Price: $1,249
    Asus ZenBook 3 UX390UA | Pros: Charging via USB-C. Very small. | Cons: HD display. Only one USB-C port. No Thunderbolt 3 ports. Expensive. Small battery. | Price: $1,599
    Asus ZenBook Flip UX360CA | Pros: One USB-C port. 2-in-1 laptop. | Cons: Underpowered m3 CPU. 6th generation CPU. HD Display. No Thunderbolt 3 ports. Proprietary power plug. Unnecessary SD slot. Unnecessary video out port. | Price: $749
    Microsoft Surface Book | Pros: Nearly 4K display. Surface pen included. 2-in-1 laptop. | Cons: No USB-C ports. No Thunderbolt 3 ports. Unnecessary video ports. Unnecessary SD slot. Expensive. Underpowered i5 CPU. 6th generation CPU. | Price: $1,499
    Lenovo ThinkPad X1 Carbon 4th Generation 14" | Pros: HiDPI display. | Cons: No USB-C ports. No Thunderbolt 3 ports. Too many video out ports. Unnecessary SD slot. 6th generation CPU. | Price: $1,548
    Samsung Notebook 9 spin | Pros: Nearly 4K display. 2-in-1 laptop. | Cons: Unnecessary SD slot. Unnecessary video out port. No USB-C ports. No Thunderbolt 3 ports. Small battery. 6th generation CPU. 8GB maximum RAM. | Price: $1,199

              Link roundup #6        
    "With two inputs, a neuron can classify the data points in two-dimensional space into two kinds with a straight line. If you have three inputs, a neuron can classify data points in three-dimensional space into two parts with a flat plane, and so on. This is called 'dividing n-dimensional space with a hyperplane.'"
    — Understanding neural networks with TensorFlow Playground

    "We were looking to make usage of Kafka and Python together just as fast as using Kafka from a JVM language. That’s what led us to develop the pykafka.rdkafka module. This is a Python C extension module that wraps the highly performant librdkafka client library written by Magnus Edenhill."
    — PyKafka: Fast, Pythonic Kafka, at Last!

    "To satisfy this claim, we need to see a complete set of statically checkable rules and a plausible argument that a program adhering to these rules cannot exhibit memory safety bugs. Notably, languages that offer memory safety are not just claiming you can write safe programs in the language, nor that there is a static checker that finds most memory safety bugs; they are claiming that code written in that language (or the safe subset thereof) cannot exhibit memory safety bugs."
    — "Safe C++ Subset" Is Vapourware

    "For each mutant the tool runs the unit test suite; and if that suite fails, the mutant is said to have been killed. That's a good thing. If, on the other hand, a mutant passes the test suite, it is said to have survived. This is a bad thing."
    — Mutation Testing

    "This meant that literally everything was asynchronous: all file and network IO, all message passing, and any “synchronization” activities like rendezvousing with other asynchronous work. The resulting system was highly concurrent, responsive to user input, and scaled like the dickens. But as you can imagine, it also came with some fascinating challenges."
    — Asynchronous Everything
              Link roundup #1        
    "You, me, every software developer out there, right now, is creating legacy software."
    — All Software is Legacy

    "... Einstein: an expansive human consciousness that could form a concept so far beyond the experimental capabilities of his day that inventing the tools to prove its validity took a hundred years."
    — Letter regarding the first direct detection of gravitational waves

    "Services should only log actionable information."
    "Services should instrument every meaningful number available for capture."

    — Logging v. instrumentation

    "sketch-rnn was able to generate a variety of Kanji that do not exist, but resemble somewhat the way Kanji are supposed to be written."
    — Recurrent Net Dreams Up Fake Chinese Characters in Vector Format with TensorFlow

    "What makes Go successful?"
    "My answer: Simplicity."
    — Simplicity is Complicated

    "I'm going to explain just why Swift's String API is ... the best string API out there in terms of its fundamental design."
    — Why is Swift's String API So Hard?
              How to analyze tweet sentiments with PHP Machine Learning        

    In a post on Sitepoint Allan MacGregor gives a good practical example on how to work with PHP-ML, a machine learning library for PHP. As of late, it seems everyone and their proverbial grandma is talking about Machine Learning. Your social media feeds are inundated with posts about ML, Python, TensorFlow, Spark, Scala, Go and […]

    The post How to analyze tweet sentiments with PHP Machine Learning appeared first on murze.be.


              Fixing Site SEO With One Google Data Studio Report        

    On any given day, hundreds of blog posts will tell you what the latest and greatest in SEO advice is. From accelerated mobile pages to Tensorflow-powered topic models, there are new innovations in SEO all the time. Fix the Basics First However, all the newfangled innovations in SEO are rendered largely ineffective when the basics […]

    The post Fixing Site SEO With One Google Data Studio Report appeared first on Christopher S. Penn Marketing Blog.


              ã€Python】Tensorflowでresizeした画像をmatplotlibで表示したい        
    >>> import matplotlib
    >>> import matplotlib.pyplot as plt
    >>> import tensorflow as tf
    >>> tf.__version__
    '1.1.0'
    >>> matplotlib.__version__
    '2.0.0'
    

    Tensorflow's image preprocessing functions seem pretty extensive and I wanted to use them, so I was practicing.

    tf.image.resize_images  |  TensorFlow

    I just wanted to do a simple resize, so I use the function above.

    # image -> an image loaded with OpenCV or similar
    >>> tf_image = tf.image.resize_images(image, [100, 100])
    >>> session = tf.Session()
    >>> with session.as_default():
    ...     output = tf_image.eval()
    ...
    >>> plt.imshow(output)
    >>> plt.show()
    

    Then an image that looked inverted(?) came out, and I was confused.

    Why?

    stackoverflow.com

    To display a tensor with matplotlib, you just need to cast it to tf.float32 and then divide by 255.0.

    >>> image = tf.cast(image, tf.float32) / 255.0
    >>> tf_image = tf.image.resize_images(image, [100, 100])
    >>> session = tf.Session()
    >>> with session.as_default():
    ...     output = tf_image.eval()
    ...
    >>> plt.imshow(output)
    >>> plt.show()
    

    With this I could view it.

    Come to think of it, shouldn't I just look at it in TensorBoard instead of matplotlib?

    .... (I haven't tried it, so I don't know.)


              The Android Things Developer Preview 2 is out, adds support for Intel's Joule, brings TensorFlow for machine learning on IoT platforms, and more        

    It's been a big day from the mystical Google land. In addition to all of the Wear stuff, the team behind Android Things has released the second Developer Preview for supported Internet-of-Things platforms. It brings some new features and a few bug fixes, as well as support for the Intel Joule.

    Android Things, formerly known as Brillo, is Google's solution to a comprehensive, friendly IoT foundation upon which to build awesome products.

    Read More

    The Android Things Developer Preview 2 is out, adds support for Intel's Joule, brings TensorFlow for machine learning on IoT platforms, and more was written by the awesome team at Android Police.


              How AI can help make safer baby food (and other products)        

    Editor’s note: Whether you’re growing cucumbers or building your own robot arm, machine learning can help. In this guest editorial, Takeshi Ogino of Kewpie tells us how they used machine learning to ensure the quality and safety of the ingredients that go into their food products.

    Quality control is a challenge for most industries, but in the world of food production, it’s one of the biggest. With food, products are as good as the ingredients that go into them. Raw materials can vary dramatically, from produce box to produce box, or even from apple to apple. This means inspecting and sorting the good ingredients from the bad is one of the most important tasks any food company does. But all that work inspecting by hand can be time-consuming and arduous both in terms of overhead and manpower. So what’s a food company to do?

    At Kewpie Corporation, we turned to a surprising place to explore better ways to ensure food quality: artificial intelligence built on TensorFlow.

    Although Kewpie Corporation is most famous for our namesake mayonnaise, we’ve been around for 100 years with dozens of products, from dressings to condiments to baby foods. We’ve always believed that good products begin with good ingredients.


    Ingredients that are safe and also give you peace of mind

    Last October, we began investigating whether AI and machine learning could ensure the safety and purity of our ingredients faster and more reliably than ever.

    The project began with a simple question: “What does it mean to be a ‘good’ ingredient?” The ingredients we purchase must be safe, of course, and from trustworthy producers. But we didn’t think that went far enough. Ingredients also need to offer peace of mind. For example, the color of potatoes can vary in ways that have nothing to do with safety or freshness.

    Kewpie depends on manual visual detection and inspection of our raw ingredients. We inspect the entire volume of ingredients used each day, which, at four to five tons, is a considerable workload. The inspection process requires a certain level of mastery, so scaling this process is not easy. At times we’ve been bottlenecked by inspections, and we’ve struggled to boost production when needed.

    We’d investigated the potential for mechanizing the process a number of times in the past. However, the standard technology available to us, machine vision, was not practical in terms of precision or cost. Using machine vision meant setting sorting definitions for every ingredient. At the Tosu Plant alone we handle more than 400 types of ingredients, and across the company we handle thousands.

    That’s when I began to wonder whether using machine learning might solve our problem.

    Using unsupervised machine learning to detect defective ingredients

    We researched AI and machine learning technology across dozens of companies, including some dedicated research organizations. In the end, we decided to go with TensorFlow. We were impressed with its capabilities as well as the strength of its ecosystem, which is global and open. Algorithms that are announced in papers get implemented quickly, and there’s a low threshold for trying out new approaches.

    One great thing about TensorFlow is that it has such a broad developer community. Through Google, we connected with our development partner, BrainPad Inc, who impressed us with their ability to deliver production level solutions with the latest deep learning. But even BrainPad, who had developed a number of systems to detect defective products in manufacturing processes, had never encountered a company with stricter inspection standards than ours. Furthermore, because our inspections are carried out on conveyor belts, they had to be extremely accurate at high speeds. Achieving that balance between precision and speed was a challenge BrainPad looked forward to tackling.

    Sorting diced potato pieces at the Tosu Plant.

    To kick off the project, we started with one of our most difficult inspection targets: diced potatoes. Because they’re an ingredient in baby food, diced potatoes are subject to the strictest scrutiny both in terms of safety and peace of mind. That meant feeding more than 18,000 line photographs into TensorFlow so that the AI could thoroughly learn the threshold between acceptable and defective ingredients.

    Our big breakthrough came when we decided to use the AI not as a ”sorter” but an ”anomaly detector.” Designing the AI as a sorter meant supervised learning, a machine learning model that requires labels for each instance in order to accurately train the model. In this case that meant feeding into TensorFlow an enormous volume of data on both acceptable and defective ingredients. But it was hugely challenging for us to collect enough defective sample data. But by training the system to be an anomaly detector we could employ unsupervised learning. That meant we only needed to feed it data on good ingredients. The system was then able to learn how to identify acceptable ingredients, and reject as defective any ingredients that failed to match. With this approach, we achieved both the precision and speed we wanted, with fewer defective samples overall.
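
    As a rough sketch of this anomaly-detection idea (my own illustration, not Kewpie's or BrainPad's actual system), one common approach is to train an autoencoder only on images of good ingredients and flag anything whose reconstruction error is unusually high:

        import numpy as np
        from keras.models import Sequential
        from keras.layers import Dense

        # good_images: assumed array of flattened images of acceptable ingredients,
        # scaled to [0, 1]; no defective samples are needed for training.
        good_images = np.random.rand(1000, 784)              # placeholder data

        autoencoder = Sequential([
            Dense(64, activation="relu", input_shape=(784,)),
            Dense(784, activation="sigmoid"),
        ])
        autoencoder.compile(optimizer="adam", loss="mse")
        autoencoder.fit(good_images, good_images, epochs=10, batch_size=32)

        def is_defective(image, threshold=0.02):
            # A high reconstruction error means the sample does not look like
            # the "good" ingredients the model was trained on.
            reconstruction = autoencoder.predict(image[np.newaxis, :])
            error = np.mean((reconstruction - image) ** 2)
            return error > threshold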

    By early April, we were able to test a prototype at the Tosu Plant. There, we ran ingredients through the conveyor belt and had the AI identify which ones were defective. We had great results. The AI picked out defective ingredients with near-perfect accuracy, which was hugely exciting to our staff.

    The inspection team at the Tosu Plant.

    It’s important to note that our goal has always been to use AI to help our plant staff, not replace them. The AI-enabled inspection system performs a rough removal of defective ingredients, then our trained staff inspects that work to ensure nothing slips through. That way we get “good” ingredients faster than ever and are able to process more food and boost production.

    Today we may only be working with diced potatoes, but we can’t wait to expand to more ingredients like eggs, grains and so many others. If all goes well, we hope to offer our inspection system to other manufacturers who might benefit. Existing inspection systems such as machine vision have not been universally adopted in our industry because they're expensive and require considerable space. So there’s no question that the need for AI-enabled inspection systems is critical. We hope, through machine learning, we’re bringing even more safe and reassuring products to more people around the world.


              Nutanix and Google Cloud team up to simplify hybrid cloud        

    Today, we’re announcing a strategic partnership with Nutanix to help remove friction from hybrid cloud deployments for enterprises. We often hear from our customers that they’re looking for solutions to deploy workloads on premises and in the public cloud.

    Benefits of a hybrid cloud approach include the ability to run applications and services, either as connected or disconnected, across clouds. Many customers are adopting hybrid cloud strategies so that their developer teams can release software quickly and target the best cloud environment for their application. However, applications that span both infrastructures can introduce challenges. Examples include difficulty migrating workloads such as dev-testing that need portability and managing across different virtualization and infrastructure environments.

    Instead of taking a single approach to these challenges, we prefer to collaborate with partners and meet customers where they are. We're working with Nutanix on several initiatives, including:

    • Easing hybrid operations by automating provisioning and lifecycle management of applications across Nutanix and Google Cloud Platform (GCP) using the Nutanix Calm solution. This provides a single control plane to enable workload management across a hybrid cloud environment.

    • Bringing Nutanix Xi Cloud Services to GCP. This new hybrid cloud offering will let enterprise customers leverage services such as Disaster Recovery to effortlessly extend their on-premise datacenter environments into the cloud.

    • Enabling Nutanix Enterprise Cloud OS support for hybrid Kubernetes environments running Google Container Engine in the cloud and a Kubernetes cluster on Nutanix on-premises. Through this, customers will be able to deploy portable application blueprints that target both an on-premises Nutanix footprint as well as GCP.

    In addition, we’re also collaborating on IoT edge computing use-cases. For example, customers training TensorFlow machine learning models in the cloud can run them on the edge on Nutanix and analyze the processed data on GCP.

    We’re excited about this partnership as it addresses some of the key challenges faced by enterprises running hybrid clouds. Both Google and Nutanix are looking forward to making our products work together and to the experience we'll deliver together for our customers.


              Showing What Algorithmic Influence On Markets Leaves Out        

    I’ve been playing with different ways of visualizing the impact that algorithms are making on our lives. How they are being used to distort the immigration debate, and how the current administration is being influenced and p0wned by Russian propaganda. I find shedding light on how algorithms are directly influencing a variety of conversations using machine learning a fun pastime. I’m also interested in finding ways to shine a light on what gets filtered out, omitted, censored, or completely forgotten by algorithms, and their authors.

    One of my latest filters I've trained using TensorFlow is called "Feed the People". It is based on an early 20th century Soviet propaganda poster that I do not know much of the history behind, but which I feel makes a compelling point, while also providing an attractive and usable color palette and textures–I will have to do more research on the back story. I took this propaganda poster and trained a TensorFlow machine learning model for about 24 hours on an AWS EC2 GPU instance, which cost me about $18.00 for the entire process–leaving me with an ML model I can apply to any image.

    Once I had my trained machine learning model, I applied it to a handful of images, including one I took of the Adam Smith statue in Edinburgh, Scotland–which, interestingly, was commissioned in 2003 by the Adam Smith Institute (ASI), a neoliberal (formerly libertarian) think tank and lobbying group based in the United Kingdom and named after Adam Smith, the Scottish moral philosopher and classical economist. The result takes the essence of the "feed the people" propaganda and algorithmically transfers it to an image of the famous 18th century economist, installed on the city streets by a neoliberal think tank in 2003.

    I’m super fascinated by how algorithms influence markets, from high speed trading, all the way to how stories about markets are spread on Facebook by investors, and libertarian and neoliberal influencers. Algorithms are being used to distort, contort, p0wn, influence and create new markets. I am continuing to trying to understand how propaganda and ideology is influencing these algorithms, but more importantly highlighting the conversations, and people that are ultimately left behind in the cracks as algorithms continue to consume our digital and physical worlds, and disrupt everything along the way.


              The Imperative to Democratize Artificial Intelligence         
    November 30, 2016 (LocalOrg) - MIT Technology Review recently published an article titled "An AI Ophthalmologist Shows How Machine Learning May Transform Medicine." It describes how Google researchers at the DeepMind subsidiary used artificial intelligence (AI) to scan images of human eyes and detect a common form of blindness as well as, or better than, trained experts can.


    They achieved this by using the same machine learning techniques Google and other tech giants including Facebook use to analyze images that show up on their web platforms. Instead of creating complex programs to handle every conceivable detail in an image, researchers instead teach machines how to learn on their own when exposed to large volumes of pre-tagged examples.

    In the MIT Technology Review article, DeepMind's algorithm studied some 128,000 retinal images that were already classified by ophthalmologists.

    The breakthrough is only the latest in a long line of advances in AI. AI machine learning is already being widely used in real-world applications, including sifting through the United Kingdom's National Health Service's records, automatically tagging - and flagging - images, videos, and voice across vast social networks, improving efficiency at utility plants by spotting trends and automatically adjusting power consumption, inputs, and outputs, as well as developing protocols for both pharmaceutical production and genetic engineering.



    DeepMind's research into analyzing medical imagery is already set to be integrated into its UK NHS collaboration, according to the Guardian in an article titled, "Google DeepMind pairs with NHS to use machine learning to fight blindness," which reports:
    Google DeepMind has announced its second collaboration with the NHS, working with Moorfields Eye Hospital in east London to build a machine learning system which will eventually be able to recognise sight-threatening conditions from just a digital scan of the eye. 

    The collaboration is the second between the NHS and DeepMind, which is the artificial intelligence research arm of Google, but Deepmind’s co-founder, Mustafa Suleyman, says this is the first time the company is embarking purely on medical research. An earlier, ongoing, collaboration, with the Royal Free hospital in north London, is focused on direct patient care, using a smartphone app called Streams to monitor kidney function of patients.
    In essence, those who control AI technology have access to algorithms that can perform specific tasks better than any trained human can. This confers an immense advantage on those who control the technology and creates a disparity that those without it have no means of competing against.


    Corporations and nations wielding this power, as the number of applications expands, represent an alarming, emerging disparity that may lead to the same sort of abuses and exploitation other forms of technological disparity throughout history have wrought.

    Democratizing AI 

    Developing AI applications involves big data. Training machines, rather than merely programming them, means exposing them to large amounts of information they can sift through and train themselves with. To do this, large amounts of information not only need to be collected, they need to be tagged or otherwise classified so machines have a baseline to improve against.

    Image: A deep learning developer box, via CADnetwork.
    The development of these large data sets, as well as developing algorithms to exploit them, requires (at the moment) large numbers of participants outside of corporations like Google and their subsidiaries like DeepMind.

    Toward that end, open-source software libraries for machine learning, like Google's TensorFlow, are available online for free. GitHub, an online development repository, offers access to a wide range of other machine learning libraries that coders and programmers can use.

    The physical hardware currently being used to build deep learning machines include GPUs (Graphics Processing Units) similar to those found in high-end gaming computers. Instructions are online on how to build deep learning machines, including information provided by companies like NVIDIA which make commercially available GPUs.

    While it remains to be seen what individual or independent groups of developers can achieve in terms of democratizing this technology, it may be in the best interests of nation-states to begin developing their own AI programs rather than wait for Google, Facebook, and even China's Baidu to "share" this technology with them.

    It may also be in their best interests to examine the merits of promoting the democratization of this technology. Where a lack of resources to acquire high-level researchers at an institutional level exists, democratizing and thus tapping a larger pool of talent to even the odds in the AI race while also raising public literacy regarding this increasingly pivotal technology may be an alternative option.

    Research into AI cannot be "banned" and breakthroughs cannot be "un-invented." With the tools already widely (and in some cases, freely) available to advance AI, attempts to put this civilization-changing technology "back in the box" will only waste time and resources. The only way to counter the harmful application of AI is by possessing an equal or greater capacity to utilize the technology and increase the number of people both educated in how it works, and capable of applying it in reaction to harmful exploitation of it.

    Just like information technology, nuclear weapons, or even firearms tilted the global balance of power in favor of those who initially wielded them before more acquired and exploited these technologies, AI too poses a threat unless and until it is more widely adopted and democratized.

    With the power to focus on and master any task at superhuman levels, we ignore the challenge to balance this emerging power at our own peril.

    LocalOrg seeks to explore local solutions to global problems by empowering people locally with education and technology to not only survive, but to thrive.
     

              Self coloring books        
    Machine learning is eating software. Here at comma.ai we want to build the best machine learning. This makes us all work really hard, and sometimes we need some stress relief. Our art therapist suggested we try adult coloring books to relax. They worked so well for us that we decided to share the love with the world and built commacoloring, the comma.ai adult coloring books.

    commacoloring was really well received and made it to the front page of Product Hunt. We got a lot of feedback from our users (we love users!). A feature was requested to automatically color the easy parts of the image, letting the user focus on the details. We used our self-driving-car engineering skills to build a self-coloring book.

    We call this new feature Suggestions. You can try right now by clicking the "suggest" button!

    The engineering

    Note: you can skip that section without affecting your coloring experience, but if you are familiar with deep learning jargon, please read along.

    To automate the coloring process we trained a deep neural network for pixel-level semantic parsing, i.e. a network that classifies (colors) each pixel using information from its surroundings. Given the state of the art, we knew the right approach would be a fully convolutional neural network. We started by trying an encoder-decoder-like architecture with 4 convolutions down and 4 deconvolutions up [1], with one output channel per class. This was taking too long to converge, though.

    We later noticed that [2] claims that retraining the encoder network is not really necessary. They used a pre-trained VGG for dense classification in low resolution and bilinear interpolation followed by Conditional Random Fields for upscaling the image back to its desired size. Also [3] stated that the job of the decoder/deconvolution network is to mainly upscale and smooth the segmented output image and it can be a smaller network. Reddit brought our attention to ReSeg [4] that uses only the convolutional layers of VGG as the encoder.

    Our final solution combined ideas from [3] and [4] and used fixed VGG convolutional layers as the encoder and trained a simple deconvolutional network as the decoder. Each layer of our decoder used only 16 filters of 5x5 pixels with upscaling stride of 2. We tried faster upscaling with stride 4 but the results didn't look sharp enough.
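
    As a rough illustration of that architecture (not comma.ai's code), here is how a frozen-VGG encoder plus a small transposed-convolution decoder might look in today's Keras, which now ships a deconvolution layer. The input size, class count, and number of decoder layers are assumptions, chosen so the decoder undoes VGG's 32x downscaling.

        # Sketch of the described architecture: frozen VGG16 conv layers as the
        # encoder, plus a small decoder of 5x5 transposed convolutions with 16
        # filters and stride 2. Input shape and number of classes are assumptions.
        from tensorflow.keras import Model, layers
        from tensorflow.keras.applications import VGG16

        NUM_CLASSES = 12             # assumed
        INPUT_SHAPE = (256, 320, 3)  # assumed

        encoder = VGG16(include_top=False, weights="imagenet", input_shape=INPUT_SHAPE)
        encoder.trainable = False    # keep the VGG filters fixed, as in the post

        x = encoder.output           # 32x smaller than the input after five poolings
        for _ in range(5):           # five stride-2 upsamplings restore the resolution
            x = layers.Conv2DTranspose(16, 5, strides=2, padding="same",
                                       activation="relu")(x)
        outputs = layers.Conv2D(NUM_CLASSES, 1, activation="softmax")(x)  # per-pixel classes

        model = Model(encoder.input, outputs)
        model.compile(optimizer="adam", loss="sparse_categorical_crossentropy")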

    In one of our experiments we reinitialized the VGG weights to random values and were still able to learn a successful decoder. We called this architecture Extreme Segmentation Network, since it resembles Extreme Learning Machines. Unfortunately, the acronym would compete with Echo-State Networks', so we decided to use the original VGG filters in production. Our final network is called Suggestions Network (SugNet). Some results are shown in Figures 1 and 2.


    Figure 1. Input image and self colored Suggestions example.


    Figure 2. Sample outputs of the segmentation network after 400 training epochs compared to human colored images.

    Our method was implemented entirely in Keras using the TensorFlow backend. The VGG image preprocessing used the Theano backend. At test time, using TensorFlow only, the results didn't match, and we doubted our engineering skills for a while before remembering that Theano implements true convolution (flipping the kernel) while TensorFlow implements correlation. Here is how to convert convolutional weights from Theano to TensorFlow. Keras didn't have a proper deconvolution layer, but we started working on a PR for that.
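
    The usual recipe for that conversion (shown here as a generic illustration, not necessarily the exact code the post linked) is to flip the kernels spatially, because one framework flips the kernel during convolution and the other does not, and to reorder the weight axes between the two layouts:

        # Generic Theano -> TensorFlow conversion for conv kernels (illustration).
        # Theano stores kernels as (out_ch, in_ch, rows, cols) and flips them;
        # TensorFlow stores them as (rows, cols, in_ch, out_ch) and does not.
        import numpy as np

        def theano_to_tf_kernel(w_theano):
            w = w_theano[:, :, ::-1, ::-1]        # undo the implicit kernel flip
            return np.transpose(w, (2, 3, 1, 0))  # -> (rows, cols, in_ch, out_ch)

        w_th = np.random.randn(64, 3, 5, 5).astype("float32")  # dummy Theano weights
        w_tf = theano_to_tf_kernel(w_th)
        print(w_tf.shape)  # (5, 5, 3, 64)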

    References:  
    [1] Vijay Badrinarayanan, Ankur Handa and Roberto Cipolla "SegNet: A Deep Convolutional Encoder-Decoder Architecture for Robust Semantic Pixel-Wise Labelling". arXiv:1505.07293   
    [2] Liang-Chieh Chen, George Papandreou, Iasonas Kokkinos, Kevin Murphy, Alan L. Yuille "Semantic Image Segmentation with Deep Convolutional Nets and Fully Connected CRFs". arXiv:1412.7062  
    [3] Adam Paszke, Abhishek Chaurasia, Sangpil Kim, Eugenio Culurciello "ENet: A Deep Neural Network Architecture for Real-Time Semantic Segmentation". arXiv:1606.02147.
    [4] Francesco Visin, Marco Ciccone, Adriana Romero, Kyle Kastner, Kyunghyun Cho, Yoshua Bengio, Matteo Matteucci, Aaron Courville "ReSeg: A Recurrent Neural Network-based Model for Semantic Segmentation". arXiv:1511.07053.

    We hope that Suggestions will inspire you to build even more fun apps with the open source commacoloring product. Let us know about all the amazing things you build with it.

              TensorFlow: Google releases its artificial intelligence software to the open source world        

    Google is releasing to the open source world its machine learning platform, which sits at the heart of the artificial intelligence that will soon let apps on smartphones perform functions that have been impossible until today.

    Author: byoblu
    Tags: Google, Artificial intelligence, Machine Learning, Tensorflow, Apprendimento automatico (machine learning), Intelligenza artificiale (artificial intelligence)
    Posted: 10 November 2015


              Ensemble Machine Learning in Python: Random Forest, AdaBoost - Udemy, Online         

    In recent years, we've seen a resurgence in AI, or artificial intelligence, and machine learning.

    Machine learning has led to some amazing results, like being able to analyze medical images and predict diseases on-par with human experts.

    Google's AlphaGo program was able to beat a world champion in the strategy game go using deep reinforcement learning.

    Machine learning is even being used to program self driving cars, which is going to change the automotive industry forever. Imagine a world with drastically reduced car accidents, simply by removing the element of human error.

    Google famously announced that they are now "machine learning first", and companies like NVIDIA and Amazon have followed suit, and this is what's going to drive innovation in the coming years.

    Machine learning is embedded into all sorts of different products, and it's used in many industries, like finance, online advertising, medicine, and robotics.

    It is a widely applicable tool that will benefit you no matter what industry you're in, and it will also open up a ton of career opportunities once you get good.

    Machine learning also raises some philosophical questions. Are we building a machine that can think? What does it mean to be conscious? Will computers one day take over the world?

    This course is all about ensemble methods.

    We've already learned some classic machine learning models like k-nearest neighbor and decision tree. We've studied their limitations and drawbacks.

    But what if we could combine these models to eliminate those limitations and produce a much more powerful classifier or regressor?

    In this course you'll study ways to combine models like decision trees and logistic regression to build models that can reach much higher accuracies than the base models they are made of.

    In particular, we will study the Random Forest and AdaBoost algorithms in detail.

    To motivate our discussion, we will learn about an important topic in statistical learning, the bias-variance trade-off. We will then study the bootstrap technique and bagging as methods for reducing both bias and variance simultaneously.

    We'll do plenty of experiments and use these algorithms on real datasets so you can see first-hand how powerful they are.
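
    As a taste of the kind of experiment involved (a minimal sketch with scikit-learn, not the course materials), comparing a single decision tree against bagged and boosted ensembles on a synthetic dataset looks roughly like this:

        # Minimal sketch: compare a lone decision tree with Random Forest and
        # AdaBoost on a synthetic classification problem (not the course's code).
        from sklearn.datasets import make_classification
        from sklearn.ensemble import AdaBoostClassifier, RandomForestClassifier
        from sklearn.model_selection import cross_val_score
        from sklearn.tree import DecisionTreeClassifier

        X, y = make_classification(n_samples=2000, n_features=20, n_informative=8,
                                   random_state=0)

        models = {
            "decision tree": DecisionTreeClassifier(random_state=0),
            "random forest": RandomForestClassifier(n_estimators=200, random_state=0),
            "adaboost": AdaBoostClassifier(n_estimators=200, random_state=0),
        }

        for name, model in models.items():
            scores = cross_val_score(model, X, y, cv=5)
            print(f"{name:>13}: mean accuracy {scores.mean():.3f}")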

    Since deep learning is so popular these days, we will study some interesting commonalities between random forests, AdaBoost, and deep learning neural networks.

    All the materials for this course are FREE. You can download and install Python, Numpy, and Scipy with simple commands on Windows, Linux, or Mac.

    This course focuses on "how to build and understand", not just "how to use". Anyone can learn to use an API in 15 minutes after reading some documentation. It's not about "remembering facts", it's about "seeing for yourself" via experimentation. It will teach you how to visualize what's happening in the model internally. If you want more than just a superficial look at machine learning models, this course is for you.


    NOTES:

    All the code for this course can be downloaded from my github: /lazyprogrammer/machine_learning_examples

    In the directory: supervised_class2

    Make sure you always "git pull" so you have the latest version!

    TIPS (for getting through the course):

    • Watch it at 2x.
    • Take handwritten notes. This will drastically increase your ability to retain the information.
      • Write down the equations. If you don't, I guarantee it will just look like gibberish.
    • Ask lots of questions on the discussion board. The more the better!
    • Realize that most exercises will take you days or weeks to complete.

    USEFUL COURSE ORDERING:

    • (The Numpy Stack in Python)
    • Linear Regression in Python
    • Logistic Regression in Python
    • (Supervised Machine Learning in Python)
    • (Bayesian Machine Learning in Python: A/B Testing)
    • Deep Learning in Python
    • Practical Deep Learning in Theano and TensorFlow
    • (Supervised Machine Learning in Python 2: Ensemble Methods)
    • Convolutional Neural Networks in Python
    • (Easy NLP)
    • (Cluster Analysis and Unsupervised Machine Learning)
    • Unsupervised Deep Learning
    • (Hidden Markov Models)
    • Recurrent Neural Networks in Python
    • Natural Language Processing with Deep Learning in Python

    Cost: 120 USD


              é™ˆæ±ä¸¹: Caffe、TensorFlow、MXnet三个开源库对比        
    http://chenrudan.github.io/blog/ ... 47a7b7e80a40613cfe1 陈汝丹 Caffe、TensorFlow、MXnet三个开源库对比2015年11月18日 project experience最近Google开源了他们内部使用的深度学习框架TensorFlow[1],结合之前开源的MXNet[2]和Caffe[3],对三个开源库做了一些 ...
              åä¸ªå€¼å¾—一试的开源深度学习框架        
    http://cloud.idcquan.com/yjs/80418.shtml 十个值得一试的开源深度学习框架2015-11-20 11:09  开源中国社区   早些时候Google开源了TensorFlow(GitHub),此举在深度学习领域影响巨大,因为Google在人工智能领域的研发成绩斐然,有着雄厚的人才储备,而且Google自 ...
              Google's second-generation deep learning system TensorFlow revealed for the first time        
    PPT download: Google's second-generation deep learning system TensorFlow revealed for the first time (70-page PDF download). 2015-11-05, 新智元 (AI Era). The PDF comes from Weibo user @王威廉. Google senior systems expert Jeff Dean gave a talk on Large-Scale Deep Learning for Intelligent Computer Systems at the recent Bay Area Machine Learning conference ...
              [Tutorial] Download Udemy Python for Data Science and Machine Learning Bootcamp – Python training for data science and machine learning        

    Download Udemy Python for Data Science and Machine Learning Bootcamp - Python training for data science and machine learning

    Data science is the study of extracting knowledge and insight from collections of data and information. Its goal is to derive meaning from data and to build data-driven products. People working in this field are called data scientists. One of the broadest and most widely used branches of artificial intelligence is machine learning, which is concerned with devising and discovering methods and algorithms through which computers and systems acquire the ability to learn. Data science is one of the most enjoyable fields and ranks among the 10 best and most popular jobs in the world. This job ...





              How to optimize a simple, saved TensorFlow 1.0.1 graph for inference?        

              à¸à¸¹à¹€à¸à¸´à¸¥à¸­à¸­à¸ MobileNets โมเดลประมวลผล AI ด้วย TensorFlow บนมือถือ กินพลังงานต่ำ        

    กูเกิลประกาศออกชุดโมเดล MobileNets สำหรับการประมวลผล AI บนสมาร์ทโฟนที่มีทรัพยากรจำกัด ตามแผนการผลักดัน TensorFlow ให้ทำงานบนมือถือได้

    กูเกิลบอกว่าถึงแม้เราอยู่ในยุคของคลาวด์ สามารถเรียกประมวลผลภาพได้ผ่านบริการอย่าง Cloud Vision API แต่ก็มีกรณีที่จำเป็นต้องประมวลผล AI แบบออฟไลน์บนมือถือ ซึ่งช่วงหลังมีสมรรถนะสูงมากพอแล้ว

    MobileNets เป็นโมเดลสำหรับประมวลผลภาพ (computer visions models) แบบ mobile-first ตัวแรก ตัวมันเองมีขนาดเล็ก ใช้พลังงานต่ำ โดยผู้ใช้งานสามารถเลือกระดับความแม่นยำ (accuracy) ต่ำสุดในระดับที่ยอมรับได้ทั้งหมด 16 ระดับ หรือถ้าไม่พอก็สามารถปรับแต่งค่าพารามีเตอร์ได้เอง

    MobileNets ต้องใช้กับ TensorFlow Mobile ที่ทำงานได้บน Android, iOS, Raspberry Pi
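
    For Python users, picking one of the smaller variants looks roughly like this in Keras, where the alpha width multiplier is the accuracy/size knob; treat this as an illustrative sketch rather than Google's reference code.

        # Illustrative sketch: load a small pre-trained MobileNet variant in Keras.
        # alpha is the width multiplier that trades accuracy for model size.
        import numpy as np
        from tensorflow.keras.applications import MobileNet
        from tensorflow.keras.applications.mobilenet import (decode_predictions,
                                                             preprocess_input)

        model = MobileNet(alpha=0.25, weights="imagenet")  # smallest pretrained width

        dummy = preprocess_input(np.random.uniform(0, 255, (1, 224, 224, 3)))
        preds = model.predict(dummy)
        print(decode_predictions(preds, top=3)[0])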

    Source: Google Research Blog



              Qualcomm announces Snapdragon 660 and 630        


    14nm and Kryo for mainstream

    Qualcomm has built a lot of momentum and taken mainstream market share from MediaTek and others, and the company has just announced the Snapdragon 660 and 630 to keep that going.

    The Snapdragon 660 will replace the Snapdragon 653, while the Snapdragon 630 replaces the Snapdragon 626. Both new SoCs are manufactured in 14nm FinFET and are pin- and software-compatible with their predecessors.

    Compared to the Snapdragon 653, the new Snapdragon 660 has a 20 percent faster CPU, and this is the first mainstream SoC to launch with Kryo cores. Qualcomm didn't go into many details, but it did reveal the new Kryo 260 cores. The Snapdragon 835 comes with Kryo 280 custom ARM-based cores, and we can only assume that the Kryo 260 is a variation of that core. Since the Snapdragon 660 is a 14nm SoC, we have a strong feeling that the Kryo 260 looks a lot like the Snapdragon 820's Kryo cores.

    The Kryo 260 performance quad core cluster runs at up to 2.2 GHz and has 1MB L2 cache while the efficiency cluster of Kryo 260 cores runs at up to 1.8 GHz and has 1MB L2 cache.


    The GPU got a real boost - you can expect to see 30 percent faster performance from Adreno 512 compared to a Snapdragon 653 GPU.

    The new mainstream SoC supports Snapdragon X12 modem with Cat 13 uplink and Cat 12 downlink support. Mainstream phones powered with this SoC will be able to match iPhone 7 top modem capabilities with speeds of up to 600 Mbps download and 150 Mbps upload.

    The modem supports 3x20 MHz carrier aggregation, 256-QAM and up to 4x4 MIMO on one carrier.

    On the uplink side, it supports Qualcomm Snapdragon Upload+, 2x20 MHz carrier aggregation, up to 2x 75Mbps LTE streams, 64-QAM and uplink data compression. Bear in mind that the Snapdragon X12 was first time revealed with Snapdragon 820 phones, and just one year later, this top speed technology is reaching mainstream phones.


    It also supports 2x2 MU-MIMO 802.11ac Wi-Fi with speeds up to 867 Mbps, Hexagon 680 DSP, Spectra 160 Image signal processor for better photos and up to 8 GB 1866 MHz LPDDR4. Spectra 160 ISP is ready for dual camera mainstream phones and it supports PDAF autofocus and hybrid autofocus.

    The Hexagon 680 DSP will help with machine learning algorithms, as it includes Snapdragon Neural Processing Engine SDK support. Machine learning and SDK support are coming to mainstream phones, with the TensorFlow and Caffe/Caffe2 frameworks supported on the Snapdragon 660 / 630 Hexagon DSP. Optimized software libraries include support for TensorFlow and Halide. Both platforms also support Qualcomm All-Ways Aware technology with support for the Google Awareness API.

    This technology gives Qualcomm Technologies’ next generation of "always-on" contextual experiences and uses very low power running on the Hexagon DSP.

    Quick Charge promises to charge your phone safely, much faster than before. Taking this standard to the mainstream will speed up  adoption.

    This is, again, the first mainstream chip to support Bluetooth 5.0.

    The Qualcomm Snapdragon 630 also comes with the Snapdragon X12 modem as well as the Spectra 160 ISP, but it has a modified Hexagon 642 with All-Ways Aware technology. This is another octa-core part, with four Cortex-A53 cores clocked at 2.2 GHz and four clocked at 1.8 GHz.


    It comes with Adreno 508 that should be 30 percent faster than the Adreno with Snapdragon 626. The SoC supports 8 GB memory with 1333 MHz speed LPDDR4 as well as Snapdragon X12 modem.

    The Snapdragon 660 is expected to ship to partners now, while the Snapdragon 630 should start shipping toward the end of the month. You can expect to see them in phones in the next quarter. These two really step up the mainstream game.


              TensorFlow        
    TensorFlow™ is an open source software library for numerical computation using data flow graphs. Nodes in the graph represent mathematical operations, while the graph edges represent the multidimensional data arrays (tensors) communicated between them. The flexible architecture allows you to deploy computation to one or more CPUs or GPUs in a desktop, server, or mobile [...]
              The 5 deep learning startups most worth watching in 2017        
    In 2016, artificial intelligence went mainstream. Google CEO Sundar Pichai even argued that the technology industry would shift from "mobile first" to "AI first". Apple applies AI in the iPhone, and Google applies it in the Pixel. AI is at work in Facebook's news feed, and Microsoft Word uses it too. Samsung acquired the AI startup Viv to catch up with Apple's Siri, and messaging apps such as Skype and Messenger have integrated AI chatbots.

    Within AI, a large share of research focuses on deep learning, which uses large amounts of data to train artificial neural networks so that they can handle new data. Over the past five years, more and more deep learning startups have emerged.

    In 2016, chip giant Intel acquired Nervana, a developer of deep learning hardware and software, while enterprise software company Salesforce acquired MetaMind, whose deep learning software can rapidly process large volumes of images and text. Both Nervana and MetaMind appeared on VentureBeat's 2015 list of deep learning startups to watch, and the companies on the 2016 list are moving ahead at full speed.

    Here are the 5 deep learning startups most worth watching in 2017:

    1. Bay Labs. Several startups are trying to apply deep learning to medical imaging, and Bay Labs is one of them. Its team includes engineering talent such as Johan Mathe, who previously worked on Google's Project Loon. Facebook AI Research director Yann LeCun has invested in the company, and other backers include Khosla Ventures.

    2. Cerebras Systems. Cerebras is a secretive startup led by Andrew Feldman, who previously sold his microserver company SeaMicro to AMD for $334 million. His new startup builds AI hardware. According to sources, the well-known VC firm Benchmark led a funding round of more than $20 million; Feldman declined to comment.

    3. Deep Vision. Based in Palo Alto, California, Deep Vision develops low-power chips for deep learning. Its two co-founders, Rehan Hammed and Wajahat Qadeer, wrote an interesting paper on a "convolution engine" chip multiprocessor while studying at Stanford.

    4. Graphcore. Graphcore develops intelligence processing unit (IPU) PCIe accelerators that neural networks can use for training or inference. The startup is also building software so that existing deep learning frameworks such as MXNet and TensorFlow can run on its infrastructure. Investors include Bosch Venture Capital, Foundation Capital, and Samsung Catalyst Fund.

    5. ViSenze. Founded in 2012, ViSenze beat competitors in certain tasks of the 2016 ImageNet image recognition challenge. Its investors include Rakuten Ventures. ViSenze is a spin-off of NExT, a research center established by the National University of Singapore and Tsinghua University. Its software can recognize and tag objects in images and video and surface visually similar content.
              Easier, faster: The next steps for deep learning        

    If there is one subset of machine learning that spurs the most excitement, that seems most like the intelligence in artificial intelligence, it’s deep learning. Deep learning frameworks—aka deep neural networks—power complex pattern-recognition systems that provide everything from automated language translation to image identification.

    Deep learning holds enormous promise for analyzing unstructured data. There are just three problems: It’s hard to do, it requires large amounts of data, and it uses lots of processing power. Naturally, great minds are at work to overcome these challenges.  

    What’s now brewing in this space isn’t just a clash of supremacy between competing deep learning frameworks, such as Google’s TensorFlow versus projects like Baidu’s Paddle. Rivalry between multiple software frameworks is a given in most any part of IT.



              What deep learning really means        

    Perhaps the most positive technical theme of 2016 was the long-delayed triumph of artificial intelligence, machine learning, and in particular deep learning. In this article we'll discuss what that means and how you might make use of deep learning yourself.

    Perhaps you noticed in the fall of 2016 that Google Translate suddenly went from producing, on the average, word salad with a vague connection to the original language to emitting polished, coherent sentences more often than not -- at least for supported language pairs, such as English-French, English-Chinese, and English-Japanese. That dramatic improvement was the result of a nine-month concerted effort by the Google Brain and Google Translate teams to revamp Translate from using its old phrase-based statistical machine translation algorithms to working with a neural network trained with deep learning and word embeddings employing Google's TensorFlow framework.



              Import Python: Import Python 136        
    Worthy Read

    I faced an interesting challenge at work the other day. I felt like sharing because it might save a few hours for others, or reveal some insights about the Python internals.
    python object

    In this video series, we will be tackling Python Regular Expressions. The first few videos we will go over the basics, and then tackle some intermediate problems using Python Regular Expressions.
    regular expression

    eSignature API Integration. HelloSign eSign API. Test the API for free.
    sponsor

    process pool

    In this tutorial, I’ll be taking you through the basics of developing a vehicle license plate recognition system using the concepts of machine learning with Python.
    machine learning

    core-python
    ,
    code snippets

    logging beyond 101
    logging

    We will see in this article how to detect if an image contains celebrities with Sightengine.
    machine learning

    Curator's Note - I am a big Game of Thrones fan so had to share this. As a fan of Game of Thrones, I couldn’t wait for it to return for a 7th season. Watching the season premiere, I greatly enjoyed that iconic scene of Sam doing his chores at the Citadel. I enjoyed it so much that I wanted to see more of it… much more of it. In this post we’ll take the short video compilation of Sam cleaning the Citadel, we will split it to multiple sub clips and create a video of Sam cleaning the Citadel using a random mix of those sub clips.
    video processing

    The aim of this short notebook is to show how to use NumPy and SciPy to play with spectral audio signal analysis (and synthesis).
    numpy
    ,
    scipy

    Every once in a while it is useful to take a step back and look at pandas’ functions and see if there is a new or better way to do things. I was recently working on a problem and noticed that pandas had a Grouper function that I had never used before. I looked into how it can be used and it turns out it is useful for the type of summary analysis I tend to do on a frequent basis.
    pandas

    For any program that is used by more than one person you need a way to control identity and permissions. There are myriad solutions to that problem, but most of them are tied to a specific framework. Yosai is a flexible, general purpose framework for managing role-based access to your applications that has been decoupled from the underlying platform. This week the author of Yosai, Darin Gordon, joins us to talk about why he started it, his experience porting it from Java, and where he hopes to take it in the future.
    podcast

    Recently, I worked on a Python project that required the whole codebase to be protected using Cython. Although protecting Python sources from reverse engineering seems like a futile task at first, cythonizing all the code leads to a reasonable amount of security (the binary is very difficult to disassemble, but it's still possible to e.g. monkey patch parts of the program). This security comes with a price though - the primary use case for Cython is writing compiled extensions that can easily interface with Python code. Therefore, the support for non-trivial module/package structures is rather limited and we have to do some extra work to achieve the desired results.
    cpython

    The complication arises when invoking awaitable functions. Doing so requires an async defined code block or coroutine. A non-issue except that if your caller has to be async, then you can’t call it either unless its caller is async. Which then forces its caller into an async block as well, and so on. This is “async creep”.
    asyncio

    Maybe you’ve heard about it in preparing for coding interviews. Maybe you’ve struggled through it in an algorithms course. Maybe you’re trying to learn how to code on your own, and were told somewhere along the way that it’s important to understand dynamic programming. Using dynamic programming (DP) to write algorithms is as essential as it is feared.
    algorithms

    pandas

    mrjob
    ,
    mapreduce

    Today, let’s use TensorFlow to build an artificial neural network that detects fake banknotes.
    tensorflow

    What would you do if you wanted to know which files are the most similar to a particular text-based file? For example to find a particular configuration file which has changed its filename and its contents.
    project


    Jobs

    London, United Kingdom
    Forward Partners is the UK's largest dedicated seed stage VC with £80m AUM. We focus on next-generation eCommerce companies and applied AI startups.


    Projects

    pytorch-nice - 53 Stars, 1 Fork
    Support powerful visual logging in PyTorch.

    CryptoTracker - 52 Stars, 2 Fork
    A complete open source system for tracking and visualizing cryptocurrency price movements on leading exchanges.

    Imports-in-Python - 41 Stars, 4 Fork
    A guide on how importing works in Python.

    Solving minesweeper with Tensorflow.

    Baidu-Dogs - 19 Stars, 0 Fork
    Baidu competition for classifying dogs.

    EffectiveTensorflow - 4 Stars, 1 Fork
    Guides and best practices for effective use of Tensorflow.

    minimal_flight_search - 3 Stars, 0 Fork
    A minimalist flight search engine written in Python.

    django_rest_example - 3 Stars, 0 Fork
    Django/DRF rest application example.

    ytsearch - 0 Stars, 0 Fork
    A program to search and view YouTube videos.


              Talk Python to Me: #124 Python for AI research        
    We all know that Python is a major player in the application of machine learning and AI. That often involves grabbing Keras or TensorFlow and applying it to a problem. But what about AI research, when you're actually trying to create something that has yet to be created? How do researchers use Python here?

    Today you'll meet Alex Lavin, a Python developer and research scientist at Vicarious, where they are trying to develop artificial general intelligence for robots.

    Links from the show:

    • Alex on the web: https://www.lavin.io/
    • Alex on Twitter: https://twitter.com/theAlexLavin
    • Vicarious: http://www.vicarious.com/
    • NOVA's Great Robot Race Documentary: https://www.youtube.com/watch?v=vCRrXQRvC_I
              10 great GitHub repositories focusing on IPython, TensorFlow and Theano        
    This is a collection of 10 great GitHub repositories focusing on IPython, TensorFlow, Theano and related topics, for data scientists. The last one is not on GitHub. http://www.datasciencecentral.com/profiles/blogs/top-10-ipython-tutorials-for-data-science-and-machine-learning?utm_content=bufferef7f4&utm_medium=social&utm_source=twitter.com&utm_campaign=buffer
              [Book review] Want to analyze data but have none? Time for web scraping!        

    I received a review copy of this book, so here is an introductory article.

    This book is about web scraping, the technique of collecting information from the web.

    Web scraping is an extremely useful skill for data analysts.

    When you set out to analyze data, how to collect that data can itself become a problem.

    For example, you may want to do text analysis but have no suitable dataset at hand, and need to gather the data yourself.

    In that case, collecting data from Wikipedia, Twitter, and the like requires learning about web technologies and programming.

    Take HTML, one of those web technologies: most books cover its many markup tags, the several versions in circulation, and the subtle differences between browsers, but such knowledge is often unnecessary for an analyst who simply wants to collect data from the web.

    This book offers the shortest route for analysts unfamiliar with web technologies and programming to learn the minimum they need in order to collect data from the web.

    It covers the essentials for beginners, making it a good first book for anyone who lacks the background but wants to get up and running quickly.

    The statistical language R used in the book provides packages that simplify web scraping, so even programming beginners should find it fairly easy to write the scraping code they have in mind.

    Chapter by chapter, the contents look like this:

    1. R basics
    2. Web technologies
      • HTML, CSS, XPath, XML, JSON, HTTP, OAuth
      • The httr package, the XML package
      • Regular expressions (the stringr package)
    3. Introduction to web APIs
      • Wikipedia, Facebook, Yahoo!, Twitter
    4. Web scraping in practice
      • The rvest package, Selenium (the RSelenium package)
    5. Web APIs in practice
      • e-Stat, Google Cloud Vision, GitHub
    6. Open data

    Chapter 1 briefly explains the basics of R, keeping it concise by covering only the knowledge needed for web scraping.

    Chapter 2 covers fundamental web technologies. It also sticks to the essentials, yet devotes a fair number of pages to them.

    For example, CSS selectors and XPath are important techniques for specifying which elements to extract when you want only certain parts of a web document.

    The extracted data may still contain unwanted pieces, and in such cases you need string processing with regular expressions.

    XML and JSON matter because they are the data formats commonly returned when retrieving information from the web.

    In this way, the book's strong point is that, out of the vast range of web technologies, it explains only what is needed for web scraping.

    Chapter 3 is about web APIs. Web scraping tends to conjure up images of parsing HTML pages to extract information, but when a site provides an API, using it is the recommended approach. The author explains this point on his own blog, so please refer to that.

    Chapters 4 and 5 are the practical chapters. They present methods for actually extracting data from a variety of sites, together with concrete code.

    Chapter 6 is about open data. Many organizations now publish their data openly, and the chapter discusses data analysis that takes advantage of it.

    Summary

    This book explains web scraping, an important skill for data analysts, focusing on the essentials for beginners.

    It is recommended for anyone who wants to analyze data but has none, and for anyone who wants a quick, practical introduction to web scraping.

    Related

    Scraping with Python

    Books on scraping with Python have also been coming out one after another.


              53: Machine Learning Mad Scientists - David Crook on Technology        

    David Crook is a Microsoft Developer Evangelist formerly of Microsoft Consulting Services.  He is focused on Data Science, Machine Learning and High Performance Computing.  He is also known as the Mad Scientist on his team.

    Show notes at http://hellotechpros.com/david-crook-technology/

    Key Takeaways

    • Are you a mad scientist?
      • Scientist = someone who pursues knowledge in a structured way
      • Mad = uncontrollable and disastrous
    • Twitter is a cesspit of language
    • Studying the interactions of AI is really the study of how real humans interact
    • Many experts in the field of data science are mathematicians first and programmers second
      • It's all about being able to prove your results and make sure they are consistent and reproducible
      • Using math models to represent the data instead of a lot of if-then-else blocks
    • SCADA systems in the oil and gas industry have very complex problems: monitoring hundreds of thousands of data points and analyzing them for leaks, equipment failures, etc.
    • Azure machine learning tool has given the old systems an easy UI
      • Experts can build and train a model in less than a day, used to take months
      • Allows data scientists to solve different types of problems faster
      • Get nice added features that didn't get before
    • The amount of machine learning capability available on high-end cell phones is incredible
      • Some problems may be able to move out of the cloud and onto the device
      • You can distribute map/reduce problems to cell phones
    • More programming languages are supporting machine learning frameworks, meaning more developers can solve machine learning problems
    • If you want to get into machine learning, just start building
      • Find a problem and start working on it from a mathematical perspective

    Resources Mentioned

    Sponsors

    • BookMoreNights.com - Is your vacation rental property fully booked 100% of the time? BookMoreNights.com helps with your rental property marketing to book more nights.

              Handwritten digits recognition using google tensorflow with python        

    Handwritten digits recognition using TensorFlow with Python. The progress in technology over the last 10 years is unbelievable. Every corner of the world is using the most advanced technologies to improve existing products, while also conducting immense research into inventing products that make the world a better place to live. Some of
    + Read More

    The post Handwritten digits recognition using google tensorflow with python appeared first on Dataaspirant.


              Machine Learning with Python Course and E-Book Bundle for $49        
    4 E-Books & 5 Courses to Help You Perform Machine Learning Analytics & Command High-Paying Jobs
    Expires January 22, 2022 23:59 PST
    Buy now and get 92% off

    Deep Learning with TensorFlow


    KEY FEATURES

    Deep learning is the intersection of statistics, artificial intelligence, and data to build accurate models, and is one of the most important new frontiers in technology. TensorFlow is one of the newest and most comprehensive libraries for implementing deep learning. Over this course you'll explore some of the possibilities of deep learning, and how to use TensorFlow to process data more effectively than ever.

    • Access 22 lectures & 2 hours of content 24/7
    • Discover the efficiency & simplicity of TensorFlow
    • Process & change how you look at data
    • Sift for hidden layers of abstraction using raw data
    • Train your machine to craft new features to make sense of deeper layers of data
    • Explore logistic regression, convolutional neural networks, recurrent neural networks, high level interfaces, & more

    PRODUCT SPECS

    Details & Requirements

    • Length of time users can access this course: lifetime
    • Access options: web streaming, mobile streaming
    • Certification of completion not included
    • Redemption deadline: redeem your code within 30 days of purchase
    • Experience level required: all levels

    Compatibility

    • Internet required

    THE EXPERT

    Dan Van Boxel is a Data Scientist and Machine Learning Engineer with over 10 years of experience. He is most well-known for "Dan Does Data," a YouTube livestream demonstrating the power and pitfalls of neural networks. He has developed and applied novel statistical models of machine learning to topics such as accounting for truck traffic on highways, travel time outlier detection, and other areas. Dan has also published research and presented findings at the Transportation Research Board and other academic journals.

    Beginning Python


    KEY FEATURES

    Python is a general-purpose, multi-paradigm programming language that many professionals consider one of the best beginner languages due to its relative simplicity and applicability to many coding arenas. This course assumes no prior experience and helps you dive into Python fundamentals to come to grips with this popular language and start your coding odyssey off right.

    • Access 43 lectures & 4.5 hours of content 24/7
    • Learn vari