Abstract of: US20260105316A1
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for training an actor neural network used to select actions to be performed by an agent interacting with an environment. One of the methods includes obtaining a minibatch of experience tuples; and updating current values of the parameters of the actor neural network, comprising: for each experience tuple in the minibatch: processing the training observation and the training action in the experience tuple using a critic neural network to determine a neural network output for the experience tuple, and determining a target neural network output for the experience tuple; updating current values of the parameters of the critic neural network using errors between the target neural network outputs and the neural network outputs; and updating the current values of the parameters of the actor neural network using the critic neural network.
Abstract of: US20260105727A1
To train a computer vision model to classify health test kit results, a computing system obtains a plurality of training images. Each training image depicts a plurality of health test results in respective segments of a test membrane of a health test kit. The computing system obtains, for each training image, labeling indicating the health test results depicted by the training image. The computing system trains a plurality of local Convolutional Neural Networks (CNNs) of the computer vision model in parallel. Each of the local CNNs is trained to predict the health test result depicted in a respective one of the segments based on local features extracted by the local CNN from the respective one of the segments of each training image and global features extracted by a global CNN of the computer vision model from the test membrane of each training image.
Abstract of: AU2024367742A1
A method of operation of an unmanned aerial vehicle (UAV) service includes acquiring aerial images of a scene at an area of interest (AOI), wherein the aerial images are acquired with a UAV of the UAV service during a flight mission of the UAV that passes over the AOI; uploading a mission log of the flight mission to a backend data system of the UAV service, the mission log including image data that includes, or is derived from, at least a portion of the aerial images; and training a neural radiance field (NeRF) model with one or more of the aerial images, wherein the NeRF model comprises a neural network, which after the training, encodes a volumetric representation of the scene capable of generating novel views of the scene different than any of the aerial images used to train the NeRF model.
Abstract of: US20260104336A1
Provided is a Charpy impact specimen notch inspector and a use method thereof. The Charpy impact specimen notch inspector includes an inspection table, where a specimen auto-alignment mechanism is provided on a top of the inspection table; a support rod is fixedly connected to a rear side of the top of the inspection table; a host is fixedly connected to a front side of the support rod; and a switch button and a universal serial bus (USB) interface are provided at a left side of the host. This application can automatically recognize and align the position of the to-be-inspected Charpy impact specimen, without adjusting the Charpy impact specimen back and forth, thereby improving the inspection efficiency. Moreover, this application can directly compare the notch picture of the Charpy impact specimen with the corresponding standard model based on the residual neural network in inspection to determine whether the notch is qualified.
Abstract of: US20260105989A1
Classification of cancer condition, in a plurality of different cancer conditions, for a species, is provided in which, for each training subject in a plurality of training subjects, there is obtained a cancer condition and a genotypic data construct including genotypic information for the respective training subject. Genotypic constructs are formatted into corresponding vector sets comprising one or more vectors. Vector sets are provided to a network architecture including a convolutional neural network path comprising at least a first convolutional layer associated with a first filter that comprises a first set of filter weights and a scorer. Scores, corresponding to the input of vector sets into the network architecture, are obtained from the scorer. Comparison of respective scores to the corresponding cancer condition of the corresponding training subjects is used to adjust the filter weights, thereby training the network architecture to classify cancer condition.
Abstract of: US20260104249A1
In accordance with some embodiments, systems and methods are provided for training a neural network using simulated data so that the neural network can determine correspondence between a projection pattern and an image of the projection pattern shone onto the surface of an object. The trained neural network can be used to output a correspondence between respective pixels in an image of a real object with the projection pattern projected thereon and coordinates of the projection pattern. The method can further include, using the correspondence between respective pixels in the image and coordinates of the projection pattern, reconstructing a shape of the surface of the real object.
Abstract of: US20260105650A1
A computer-implemented method of generating multimodal data. The method comprises using a token generation neural network to generate, autoregressively, an output sequence of multimodal tokens, and in response to a next multimodal token being a start-of-image token, generating an image using an image generation subsystem conditioned on features representing the current sequence of multimodal tokens obtained from the token generation neural network. The method further comprises processing the image to convert pixels of the image into a sequence of image tokens, each image token comprising a block encoding of values of the pixels in a different region of the image that maps a set of values of the pixels to a respective image token, and appending the sequence of image tokens to the current output sequence of multimodal tokens as the next multimodal tokens in the output sequence of multimodal tokens.
Abstract of: US20260105726A1
A disaster smoke detection method includes: performing a first convolution operation on an input image to extract features to generate a primary feature map; performing enhancement processing on the primary feature map to obtain an enhanced feature map; performing multi-scale fusion on the enhanced feature map according to a second convolution operation to obtain a plurality of feature maps of different scales as high-level feature maps; respectively performing top-down feature fusion and bottom-up feature fusion on each high-level feature map at each scale according to a third convolution operation to correspondingly obtain a plurality of top-down fused feature maps and a plurality of bottom-up fused feature maps; fusing the top-down fused feature maps and the bottom-up fused feature maps to obtain a plurality of bidirectional cross-fused feature maps; and performing disaster and smoke detection on each of the bidirectional cross-fused feature maps.
Abstract of: US20260104893A1
Embodiments include systems and methods for processing sensor data and generating operational instructions of hardware of egos (e.g., autonomous vehicles, robots). The ego includes any number of machine-learning architectures, often neural network architectures, for processing sensor data and recognizing the environment around the ego and making decisions on the ego's behavior. The neural network architectures of the ego ingest sensor data and execute any number of operations related to a particular domain or task, such as object recognition or path planning, using the sensor data. A graph partitioner is trained to assign functions in the software of the neural networks and the sensor data to certain hardware processing units. Several compilers are used to generate the instructions based upon the assigned type of processing unit.
Abstract of: EP4726608A2
A computing system (10) including a processor (14) configured to receive a mesh (30) of a three-dimensional geometry (38). The processor is further configured to receive a source antenna location (40) and a destination antenna location (42) on the mesh. The processor is further configured to compute a ray path (60) as an estimated shortest path between the source antenna location and the destination antenna location. The ray path includes a geodesic path (62) over the mesh and a free space path (64) outside the mesh. The ray path is computed at least in part by computing the geodesic path at least in part by performing inferencing at a trained neural network (52). Computing the ray path further includes computing the free space path at least in part by performing raytracing from a launch point (66) located at an endpoint of the geodesic path. The processor is further configured to output the ray path to an additional computing process (70).
Abstract of: WO2024254102A1
Various embodiments discussed herein are directed to improving hardware consumption and computing performance by performing neural network operations on dense tensors using sparse value information from original tensors. Such dense tensors are condensed representations of other original tensors that include zeros or other sparse values. In order to perform these operations, particular embodiments provide an indication, via a binary map, of a position of where the sparse values and non-sparse values are in the original tensors. Particular embodiments additionally or alternatively determine shape data of the original tensors so that these operations are accurate.
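The idea of pairing a dense tensor with a binary map and shape data can be illustrated with a minimal sketch. This is not the claimed method: the function names `compress`, `decompress`, and `sparse_dot` are hypothetical, the tensor is flattened to a one-dimensional list, and zero is assumed to be the only sparse value.

```python
def compress(values, sparse_value=0):
    """Split a flat tensor into (dense values, binary map, shape data).

    The binary map holds 1 where the original value is non-sparse and
    0 where it is sparse. The shape data here is just the original length.
    """
    bitmap = [0 if v == sparse_value else 1 for v in values]
    dense = [v for v in values if v != sparse_value]
    return dense, bitmap, len(values)


def decompress(dense, bitmap, sparse_value=0):
    """Rebuild the original flat tensor from the dense values and the map."""
    remaining = iter(dense)
    return [next(remaining) if bit else sparse_value for bit in bitmap]


def sparse_dot(dense_a, bitmap_a, b):
    """Dot product of the original tensor with b, skipping sparse positions."""
    remaining = iter(dense_a)
    total = 0
    for bit, y in zip(bitmap_a, b):
        if bit:
            total += next(remaining) * y
    return total


original = [0, 3, 0, 0, 5, 2]
dense, bitmap, length = compress(original)
restored = decompress(dense, bitmap)
product = sparse_dot(dense, bitmap, [1, 2, 3, 4, 5, 6])
```

In this sketch the multiply-accumulate loop touches only the three stored values, while the binary map keeps each of them aligned with the right element of the second operand.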
Abstract of: EP4726672A1
A method for identifying objects in a set of images which are visually similar to reference objects comprises: receiving an indication of regions where reference objects are depicted; forming a reference pool with data records representing the reference objects; generating a text embedding (TE) and a visual embedding (VE) for each of the reference objects, by applying neural networks to image data from the regions where the reference objects are depicted; detecting a plurality of candidate objects in the set of images, and generating a TE and a VE for each; approving a detected candidate object for addition to the reference pool only if the reference pool contains a reference object fulfilling a first similarity criterion (C<1>), which depends both on TE similarity and VE similarity, in relation to the detected candidate object; extending the reference pool by adding all approved candidate objects; and identifying a detected candidate object as visually similar to the reference objects only if the extended reference pool contains a reference object fulfilling a second similarity criterion (C<2>), which depends on VE similarity, in relation to the detected candidate object.
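The two-stage use of the reference pool can be sketched as follows. Everything specific here is an assumption made for illustration: cosine similarity as the similarity measure, the thresholds `te_min`, `ve_min`, and `ve_final`, the dictionary layout of each object, and the function names. The abstract does not state how the two criteria are actually evaluated.

```python
import math


def cosine(u, v):
    """Cosine similarity between two embedding vectors."""
    dot = sum(a * b for a, b in zip(u, v))
    norm = math.sqrt(sum(a * a for a in u)) * math.sqrt(sum(b * b for b in v))
    return dot / norm if norm else 0.0


def find_similar(references, candidates, te_min=0.8, ve_min=0.8, ve_final=0.95):
    """Return ids of candidates judged visually similar to the references.

    Each object is a dict with keys "id", "te" (text embedding) and
    "ve" (visual embedding). All thresholds are hypothetical.
    """
    # First criterion: both TE and VE similarity against the initial pool.
    approved = [
        c for c in candidates
        if any(
            cosine(c["te"], r["te"]) >= te_min and cosine(c["ve"], r["ve"]) >= ve_min
            for r in references
        )
    ]
    # Extend the pool with every approved candidate.
    pool = list(references) + approved
    # Second criterion: VE similarity only, against the extended pool.
    return [
        c["id"] for c in candidates
        if any(cosine(c["ve"], r["ve"]) >= ve_final for r in pool)
    ]


references = [{"id": "ref", "te": [1, 0], "ve": [1, 0]}]
candidates = [
    {"id": "a", "te": [1, 0], "ve": [3, 2]},  # passes the first criterion
    {"id": "b", "te": [0, 1], "ve": [3, 3]},  # close to "a" visually only
    {"id": "c", "te": [1, 0], "ve": [0, 1]},  # visually unrelated
]
similar = find_similar(references, candidates)
```

In this toy data, candidate "b" is not close enough to the original reference, but it is identified because the approved candidate "a" has joined the pool.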
Abstract of: EP4726579A1
A graph neural network-based node classification method and system, and a related device are provided. In the method, during training of a model, categories of a plurality of neighboring node samples in a graph data sample are first predicted to obtain category distribution of the plurality of neighboring node samples, and sampling is then performed on the plurality of neighboring node samples based on the category distribution and a sampling parameter input by a user to obtain a plurality of sampled nodes, so that category distribution of the plurality of sampled nodes is similar to or consistent with the category distribution of the plurality of neighboring node samples. In this way, features of the sampled nodes obtained through sampling can cover features of all neighboring nodes, thereby reducing calculation complexity. In addition, the category distribution of the sampled nodes is closer to true distribution of the neighboring nodes, thereby improving performance of a graph neural network.
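One way to keep the category distribution of the sampled nodes close to that of all neighboring nodes is stratified sampling. The sketch below assumes that the user-supplied sampling parameter is a ratio between 0 and 1 and that predicted categories are already available as a dictionary. The function name `sample_neighbors` and its arguments are hypothetical and not taken from the source.

```python
import random
from collections import defaultdict


def sample_neighbors(neighbors, predicted_category, ratio, seed=0):
    """Sample neighbors per predicted category so proportions are preserved.

    neighbors: list of node ids.
    predicted_category: dict mapping node id to its predicted category.
    ratio: assumed form of the user-supplied sampling parameter (0 < ratio <= 1).
    """
    rng = random.Random(seed)
    by_category = defaultdict(list)
    for node in neighbors:
        by_category[predicted_category[node]].append(node)

    sampled = []
    for nodes in by_category.values():
        # Keep at least one node per category, never more than are available.
        k = min(len(nodes), max(1, round(len(nodes) * ratio)))
        sampled.extend(rng.sample(nodes, k))
    return sampled


neighbors = list(range(14))
predicted_category = {n: "A" if n < 8 else "B" if n < 12 else "C" for n in neighbors}
sampled = sample_neighbors(neighbors, predicted_category, ratio=0.5)
```

With eight, four, and two neighbors in categories A, B, and C, a ratio of 0.5 yields four, two, and one sampled nodes, so the 4:2:1 proportion of the full neighborhood is retained.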
Abstract of: EP4725391A1
A method and device for diagnosing renal disease are disclosed. A control method of a diagnostic device according to one embodiment comprises: obtaining a retinal image of a subject; and obtaining renal disease diagnostic information regarding the subject using a machine learning model based on the retinal image, wherein the machine learning model includes a first model and a second model, wherein the first model is a neural network model, and wherein the second model is a regression-based machine learning model.
Abstract of: EP4726674A1
Embodiments of the present disclosure provide a method for training a fingerprint anti-counterfeiting neural network, a method for fingerprint anti-counterfeiting, an apparatus for training a fingerprint anti-counterfeiting neural network, and an apparatus for fingerprint anti-counterfeiting, comprising: obtaining a plurality of groups of training data, each group of the training data comprising: first raw domain data, second raw domain data, and third raw domain data; and training an initial classification network using the plurality of groups of training data to obtain a target classification network, wherein the initial classification network comprises a fusion subnetwork and a classification subnetwork, and for each group of the training data, the fusion subnetwork is configured to generate a first fingerprint matching pair based on a feature description matrix of the first raw domain data and the second raw domain data, and generate a second fingerprint matching pair based on a feature description matrix of the first raw domain data and the third raw domain data, and the classification subnetwork is configured to perform fingerprint classification and recognition based on the first fingerprint matching pair and the second fingerprint matching pair. The present disclosure solves the problem of low recognition accuracy rate of real and prosthetic fingerprints in related art, and achieves the effects of improving the recognition accuracy rate of real and prosthetic fingerpr
Abstract of: US20260099896A1
A method of training a neural network configured to obfuscate a facial image and an electronic device for performing the method are provided. The method includes obtaining, based on an input facial image, an output facial image in which the input facial image is obfuscated, extracting, based on the input facial image, a feature of the input facial image for reconstructing identification information included in the input facial image from the output facial image, extracting, based on the output facial image, a feature of the output facial image corresponding to the feature of the input facial image, and training the neural network based on a difference between the feature of the input facial image and the feature of the output facial image.
Abstract of: US20260100030A1
Provided is an electronic apparatus including memory configured to store at least one instruction, and a processor configured to execute the at least one instruction to obtain a first image including an object, input the first image to a first neural network model that is configured to be trained by using a plurality of second images in relation to a plurality of predefined types, obtain first probability information including a first probability of the object corresponding to a first type among the plurality of types and a second probability of the object corresponding to a second type among the plurality of types, obtain second probability information, through a second neural network model, indicating a type of the object included in the first image, by using a plurality of third images corresponding to the first type and a plurality of fourth images corresponding to the second type based on a difference between the first probability and the second probability being less than a first threshold value and based on a first input, and identify the type of the object based on the second probability information.
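The fallback from the first model to the second model can be sketched as a simple confidence-gap check. The sketch assumes that the two compared probabilities are the two highest ones, that both models are plain callables, and that the threshold is 0.1. These choices, along with the name `identify_type`, are illustrative only, and the "first input" condition mentioned in the abstract is not modeled.

```python
def identify_type(image, first_model, second_model, threshold=0.1):
    """Pick an object type, consulting a second model when the first is unsure.

    first_model(image) returns a dict mapping type name to probability.
    second_model(image, pair) returns probabilities for the two types in pair.
    Both call signatures are assumptions made for this sketch.
    """
    probabilities = first_model(image)
    ranked = sorted(probabilities.items(), key=lambda item: item[1], reverse=True)
    (first_type, p_first), (second_type, p_second) = ranked[:2]

    # A small gap between the top two probabilities means the first model
    # cannot separate the two types, so defer to the second model.
    if p_first - p_second < threshold:
        refined = second_model(image, (first_type, second_type))
        return max(refined, key=refined.get)
    return first_type


def unsure_model(image):
    return {"cat": 0.48, "dog": 0.45, "bird": 0.07}


def confident_model(image):
    return {"cat": 0.90, "dog": 0.08, "bird": 0.02}


def pairwise_model(image, pair):
    return {"cat": 0.3, "dog": 0.7}


deferred = identify_type("image", unsure_model, pairwise_model)
direct = identify_type("image", confident_model, pairwise_model)
```

Here the first call defers to the second model because the gap is only 0.03, while the second call returns the first model's top type directly.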
Abstract of: US20260100186A1
Disclosed is a sensor-processing system including, in some embodiments, a sensor, one or more sample pre-processing modules, one or more sample-processing modules, one or more neuromorphic integrated circuits (“ICs”), and a microcontroller. The one or more sample pre-processing modules are configured to process raw sensor data for use in the sensor-processing system. The one or more sample-processing modules are configured to process pre-processed sensor data including extracting features from the pre-processed sensor data. Each of the neuromorphic ICs includes at least one neural network configured to arrive at actionable decisions of the neural network from the features extracted from the pre-processed sensor data. The microcontroller includes a CPU along with memory including instructions for operating the sensor-processing system. In some embodiments, the sensor is a pulse-density modulation (“PDM”) microphone, and the sensor-processing system is configured for keyword spotting. Also disclosed are methods of such a keyword spotting sensor-processing system.
Abstract of: US20260100058A1
Disclosed herein are systems, methods, and devices for detecting traffic lane violations. In one embodiment, a method for detecting a potential traffic violation is disclosed comprising bounding a vehicle detected from one or more video frames of a video in a vehicle bounding box. The vehicle can be detected and bounded using a first convolutional neural network. The method can also comprise bounding, using the one or more processors of the edge device, a plurality of lanes of a roadway detected from the one or more video frames in a plurality of polygons. The plurality of lanes can be detected and bounded using multiple heads of a multi-headed second convolutional neural network. The method can further comprise detecting a potential traffic violation based in part on an overlap of at least part of the vehicle bounding box and at least part of one of the polygons.
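The overlap test between a vehicle bounding box and a lane polygon can be sketched with standard geometry. The sketch clips the polygon against the axis-aligned box (Sutherland-Hodgman clipping) and measures the clipped area with the shoelace formula. The decision rule, the `min_fraction` threshold, and all function names are assumptions. The abstract only says the detection is based in part on an overlap.

```python
def clip_polygon_to_box(polygon, box):
    """Clip a polygon (list of (x, y)) to an axis-aligned box (x0, y0, x1, y1)."""
    x_min, y_min, x_max, y_max = box

    def cross_x(x):
        # Intersection of segment p-q with the vertical line at x.
        return lambda p, q: (x, p[1] + (x - p[0]) / (q[0] - p[0]) * (q[1] - p[1]))

    def cross_y(y):
        # Intersection of segment p-q with the horizontal line at y.
        return lambda p, q: (p[0] + (y - p[1]) / (q[1] - p[1]) * (q[0] - p[0]), y)

    edges = [
        (lambda p: p[0] >= x_min, cross_x(x_min)),
        (lambda p: p[0] <= x_max, cross_x(x_max)),
        (lambda p: p[1] >= y_min, cross_y(y_min)),
        (lambda p: p[1] <= y_max, cross_y(y_max)),
    ]
    points = list(polygon)
    for inside, intersect in edges:
        clipped = []
        for i, current in enumerate(points):
            previous = points[i - 1]
            if inside(current):
                if not inside(previous):
                    clipped.append(intersect(previous, current))
                clipped.append(current)
            elif inside(previous):
                clipped.append(intersect(previous, current))
        points = clipped
    return points


def polygon_area(points):
    """Area of a simple polygon via the shoelace formula."""
    total = 0.0
    for i, (x1, y1) in enumerate(points):
        x2, y2 = points[(i + 1) % len(points)]
        total += x1 * y2 - x2 * y1
    return abs(total) / 2


def is_potential_violation(vehicle_box, lane_polygon, min_fraction=0.2):
    """Flag a potential violation when enough of the box lies inside the lane."""
    x_min, y_min, x_max, y_max = vehicle_box
    box_area = (x_max - x_min) * (y_max - y_min)
    overlap = polygon_area(clip_polygon_to_box(lane_polygon, vehicle_box))
    return overlap / box_area >= min_fraction


lane = [(0, 0), (4, 0), (4, 10), (0, 10)]
vehicle = (3, 2, 6, 5)
overlap_area = polygon_area(clip_polygon_to_box(lane, vehicle))
flagged = is_potential_violation(vehicle, lane)
```

In this example one third of the vehicle box lies inside the lane polygon, which exceeds the assumed 20% threshold.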
Abstract of: US20260099963A1
Apparatuses, systems, and techniques are presented to generate or manipulate digital images. In at least one embodiment, a network is trained to generate modified images including user-selected features.
Abstract of: WO2026075588A1
A method (200) is disclosed for generating an explainability output for node level predictions generated by a GNN on an input graph. The method comprises, for individual nodes in the input graph, for each incoming neighbour node of the node, creating an ordered group comprising the node and the incoming neighbour node (210), and then combining the ordered groups into a plurality of batches, each batch comprising an ordered group (220). The method further comprises, for individual batches, and for individual nodes in the batch, identifying edges in the GNN computation graph of the node that connect to nodes outside of the batch, and detaching the identified edges in the backward pass direction (230), and using the GNN to generate, in parallel, node level predictions for the nodes in the batch by performing a forward pass through the GNN computation graphs of the nodes in the batch (240). The method further comprises using a gradient based explainability method to generate, in parallel, importance scores of incoming neighbour nodes for the nodes in the batch by performing a backward pass through the GNN computation graphs of the nodes in the batch (250). The method further comprises, for individual nodes in the input graph, assembling the generated importance scores of incoming neighbour nodes into an explainability output for the node prediction generated by the GNN (260).
Abstract of: US20260100065A1
The present disclosure provides an image style conversion method. The image style conversion method includes: acquiring a first image, wherein the first image includes text information; performing image processing on the first image by a text matting neural network model to obtain a text mask; performing style conversion on the text mask based on a preset application scenario to obtain a converted text mask; and performing image fusion on the converted text mask and a background image of the preset application scenario to obtain a converted image after image style conversion.
Abstract of: US20260100267A1
Deep learning methods and systems for detecting biomarkers within volumetric biomedical imaging datasets are provided. Embodiments predict clinically useful biomarkers in optical coherence tomography images, ultrasound images, magnetic resonance imaging images, and computed tomography images using deep neural networks.
Abstract of: WO2026074166A1
The invention relates to a method for authenticating a product carrying a marking defined by a geometric distribution of a plurality of groups of binary modules, which method consists in capturing an initial image of the marking of the product, selecting minutiae from among its binary modules, and capturing a subsequent image on a candidate product. For each minutia, a measure of the similarity between the two images is computed. These similarity measures are organised in a matrix representing the position and the structure of the minutiae. An artificial neural network analyses this matrix to provide the authenticity class of the product.
Publication No.: WO2026076120A1 09/04/2026
Applicant:
GDM HOLDING LLC [US]
Abstract of: WO2026076120A1
Methods, systems, and apparatus, including computer programs encoded on computer storage media, for processing videos using neural networks. In particular, the neural network has a hybrid architecture that includes both recurrent and self-attention layers.