Introducing Caer — A GPU-accelerated Computer Vision Library

A Python library that changes the way you approach Machine Vision

Jason Dsouza

Caer: GPU-accelerated Image & Video Processing. Image by Author.

Since I released Caer back in August of this year, I have received hundreds of emails from researchers and computer vision enthusiasts around the world thanking me for releasing the library. Their feedback, good and bad, pushed and motivated me to take the library to another level.

Today, I’m excited to announce the first-ever stable release of Caer, a lightweight open-source Python library that simplifies the way you approach Computer Vision. It abstracts away unnecessary boilerplate code, enabling maximum flexibility. By offering powerful image and video processing algorithms, Caer provides both casual and advanced users with an elegant interface for Machine Vision operations.

It leverages the power of libraries like OpenCV and Pillow to speed up your Computer Vision workflow, making it ideal if you want to quickly test something out.

This design philosophy makes Caer ideal for students, researchers, hobbyists and even experts in the fields of Deep Learning and Computer Vision who want to quickly prototype deep learning models or research ideas.

jasmcaus/caer
A lightweight Computer Vision library for high-performance AI research. Caer contains powerful image and video processing operations…
www.github.com

What is Caer?

Caer is a GPU-accelerated Computer Vision library in Python that’s designed to help speed up your Computer Vision workflow.

It’s ideal for rapid prototyping so you can focus more on the experimenting rather than the building.
I use this package every single day when working on image and video processing workflows, and it saves me tons of time!

Installing Caer

The latest release of Caer can be installed via a simple pip install:

pip install --upgrade caer

Read the complete Installation Guide for more platform-specific download instructions.

Using Caer

I recommend going through the documentation for a look at all the methods in Caer.

1. Standard Test Images

Caer currently ships out of the box with 29 high-quality images from Unsplash. These are extremely handy if you want to test out a feature quickly. Simply call the matching function in caer.data, for example caer.data.beverages(), to get a standard 640×427 image.

Read the documentation for details on all the images you can reference.

To get this image, simply call caer.data.beverages(). Image by Author.

2. Advanced Resizing

Most libraries today, like OpenCV and Pillow, perform hard resizing, meaning that you lose the original aspect ratio of your image. When training Deep Neural Networks, this is not such a big deal, but in other cases it makes a big difference.

caer.resize() resizes your images to a certain target size (400×400, for instance) while still maintaining the original aspect ratio. Behind the scenes, it uses an advanced cropping mechanism that crops the image down to its most useful part.

We are currently working on a Context-Aware smart image resizer to retain the most useful information in your image without the need for cropping.

>>> import caer
>>> import matplotlib.pyplot as plt

# A standard 640×427 image
>>> img = caer.data.sunrise()

# Resizing to 400×400 while maintaining the aspect ratio
>>> resized = caer.resize(img, (400,400), keep_aspect_ratio=True)
>>> plt.imshow(resized)
>>> plt.show()

Image by Author.

3. Translation and Rotation

Translating an image in Caer is as easy as calling caer.translate().
Behind the scenes, it defines a translation matrix and translates the image.

Rotation follows the same principle: a rotation matrix is defined and the image is rotated.

# Shifts an image 50 pixels to the right and 100 pixels up
>>> translated = caer.translate(img, 50, -100)

# Rotates an image around the centre counter-clockwise by 45 degrees
>>> rotated = caer.rotate(img, 45, rotPoint=None)

Image by Author.

4. Batch Pre-processing

Got several hundred images and want to quickly compute the mean pixel intensity?

caer.preprocessing.compute_mean_from_dir() iterates over all the images in a directory and returns a tuple of the per-channel mean intensities, which can be used to perform mean subtraction.

# Computes the mean per channel over the images in the directory at `path`
>>> mean = caer.preprocessing.compute_mean_from_dir(path, channels=3, per_channel_subtraction=True)
>>> mean
(56.935615485948475, 79.85257611241218, 100.95970799180328)

# Subtracting the mean using these values
# (MeanProcess is assumed here to be importable from caer.preprocessing)
>>> mp = MeanProcess(mean, channels=3)
>>> sub = mp.mean_preprocess(img, channels=3)
>>> plt.imshow(sub)
>>> plt.show()

Image by Author.

What’s Next?

Caer is by no means an attempt to reinvent the wheel. In fact, we utilize backend frameworks like OpenCV to ensure maximum flexibility and performance for your Computer Vision models.

We are actively working to improve Caer’s functionality (contributions welcome!). In the coming days, we will be releasing a context-aware image resizer that we’ve been testing for weeks. If you’d like to request specific functionality, you can do so on our GitHub page or tweet me!

Useful Links

GitHub Repo
Documentation
Contribute to the codebase
Tweet about us!

If you like Caer, give us a ⭐️ on the repo.
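
For readers curious about the matrices mentioned in the Translation and Rotation section, here is a minimal sketch of what a translation matrix and a rotation matrix look like and how they move a single pixel coordinate. It is plain Python and is not Caer's actual implementation: the helper names translation_matrix, rotation_matrix and apply_affine are hypothetical, and the 2×3 layout is assumed to follow the convention that OpenCV's cv2.warpAffine and cv2.getRotationMatrix2D use.

```python
import math


def translation_matrix(tx, ty):
    """Hypothetical helper: 2x3 affine matrix shifting points by (tx, ty) pixels."""
    return [[1.0, 0.0, float(tx)],
            [0.0, 1.0, float(ty)]]


def rotation_matrix(angle_deg, center):
    """Hypothetical helper: 2x3 affine matrix rotating about `center`.

    Image coordinates have the origin at the top-left with y pointing down,
    so with this sign convention a positive angle appears counter-clockwise
    on screen.
    """
    cx, cy = center
    angle = math.radians(angle_deg)
    cos_a, sin_a = math.cos(angle), math.sin(angle)
    return [[cos_a, sin_a, (1.0 - cos_a) * cx - sin_a * cy],
            [-sin_a, cos_a, sin_a * cx + (1.0 - cos_a) * cy]]


def apply_affine(matrix, point):
    """Map one (x, y) coordinate through a 2x3 affine matrix."""
    x, y = point
    (a, b, c), (d, e, f) = matrix
    return (a * x + b * y + c, d * x + e * y + f)


# Shift 50 pixels right and 100 pixels up (negative y is up in image coordinates).
shift = translation_matrix(50, -100)
print(apply_affine(shift, (10, 200)))  # (60.0, 100.0)

# Rotate by 45 degrees about the centre of a 640x427 image.
centre = (320.0, 213.5)
spin = rotation_matrix(45, centre)
print(apply_affine(spin, centre))      # the centre stays (approximately) in place
print(apply_affine(spin, (640.0, 213.5)))
```

A full image warp applies the same mapping to every pixel and interpolates the result, which is the part a backend such as OpenCV takes care of.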
