Abstract of: US2024330679A1
A method for making predictions pertaining to entities represented within a heterogeneous graph includes: identifying, for each node in the heterogeneous graph structure, a set of node-target paths that connect the node to a target node; assigning, to each of the node-target paths identified for each node, a path type identifier indicative of the number of edges and corresponding edge types in the associated node-target path; and extracting a semantic tree from the heterogeneous graph structure. The semantic tree includes the target node as a root node and defines a hierarchy of metapaths, each of which corresponds to the subset of node-target paths in the heterogeneous graph structure assigned the same path type identifier. The semantic tree is encoded, using one or more neural networks, by generating a metapath embedding for each metapath in the semantic tree. Each resulting metapath embedding encodes aggregated feature-label data for the nodes in the heterogeneous graph structure whose path type identifier matches the metapath associated with that embedding. A label is then predicted for the target node based on the set of metapath embeddings.
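Illustrative sketch (not from the patent): the PyTorch fragment below shows one plausible reading of the grouping-and-embedding step, where node-target paths sharing a path type identifier are aggregated into a single metapath embedding and the target label is predicted from the pooled embeddings. All module names, shapes, and the mean-pooling choices are hypothetical.

```python
import torch
import torch.nn as nn
from collections import defaultdict

class MetapathPredictor(nn.Module):
    def __init__(self, feat_dim: int, embed_dim: int, num_labels: int):
        super().__init__()
        self.encode = nn.Linear(feat_dim, embed_dim)    # sketch uses one shared encoder; a real system may use one per metapath
        self.classify = nn.Linear(embed_dim, num_labels)

    def forward(self, paths):
        # paths: list of (path_type_id, node_feature_tensor of shape [num_nodes, feat_dim])
        buckets = defaultdict(list)
        for type_id, feats in paths:
            buckets[type_id].append(feats.mean(dim=0))  # aggregate node features along one path
        # one embedding per metapath, i.e. per shared path type identifier
        metapath_embs = [self.encode(torch.stack(v).mean(dim=0)) for v in buckets.values()]
        pooled = torch.stack(metapath_embs).mean(dim=0) # combine the hierarchy of metapaths
        return self.classify(pooled)                    # label logits for the target node

# toy usage: two paths of type "A-B", one of type "A-C", each with 3 nodes of 8 features
model = MetapathPredictor(feat_dim=8, embed_dim=16, num_labels=4)
logits = model([("A-B", torch.randn(3, 8)), ("A-B", torch.randn(3, 8)), ("A-C", torch.randn(3, 8))])
```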
Abstract of: US20260037770A1
In some embodiments, a natural language input directed to an entity may be obtained. In connection with obtaining the natural language input, a vector similarity search of a database may be performed based on the natural language input to obtain one or more vectors corresponding to stored data related to the natural language input. In some embodiments, in connection with obtaining the natural language input directed to the entity, a first neural network may be used to obtain a state vector representing stored data related to the natural language input. The natural language input and the state vector may be inputted into a second neural network associated with the entity to generate a conversation response of the entity to the natural language input for presentation via the user interface. In some embodiments, the state vector may be updated using the conversation response of the entity.
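Illustrative sketch (not from the patent): a minimal PyTorch rendering of the retrieval-plus-state flow described above, with a cosine-similarity vector search, a first network producing a state vector from the retrieved data, and a second network combining the input with the state vector. All names, dimensions, and the state-update scheme are hypothetical.

```python
import torch
import torch.nn as nn

def similarity_search(query: torch.Tensor, db: torch.Tensor, k: int = 3):
    # cosine similarity between the query embedding and every stored vector
    sims = nn.functional.cosine_similarity(query.unsqueeze(0), db, dim=1)
    return db[sims.topk(k).indices]                 # top-k vectors of stored data

class StateEncoder(nn.Module):                      # "first neural network"
    def __init__(self, dim):
        super().__init__(); self.net = nn.Linear(dim, dim)
    def forward(self, retrieved):
        return self.net(retrieved.mean(dim=0))      # state vector for the stored data

class Responder(nn.Module):                         # "second neural network"
    def __init__(self, dim):
        super().__init__(); self.net = nn.Linear(2 * dim, dim)
    def forward(self, query, state):
        return self.net(torch.cat([query, state]))  # response representation

dim, db = 16, torch.randn(100, 16)                  # toy embedding database
query = torch.randn(dim)                            # embedded natural language input
state = StateEncoder(dim)(similarity_search(query, db))
response = Responder(dim)(query, state)
state = 0.5 * state + 0.5 * response                # update state with the response (one possible scheme)
```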
Abstract of: US20260037790A1
Apparatuses, systems, and techniques to identify objects within an image. In at least one embodiment, objects are identified in an image using one or more neural networks, in which the one or more neural networks are trained using one or more decay parameters.
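The abstract does not say which decay parameters are meant; one common family is weight decay on the optimizer (learning-rate decay would be another). A minimal, purely illustrative PyTorch training step under that assumption:

```python
import torch
import torch.nn as nn

model = nn.Sequential(nn.Conv2d(3, 8, 3, padding=1), nn.ReLU(), nn.Flatten(),
                      nn.Linear(8 * 32 * 32, 10))       # toy object classifier
opt = torch.optim.SGD(model.parameters(), lr=1e-2,
                      weight_decay=1e-4)                # the "decay parameter" in this reading

x, y = torch.randn(4, 3, 32, 32), torch.randint(0, 10, (4,))
loss = nn.functional.cross_entropy(model(x), y)
opt.zero_grad(); loss.backward(); opt.step()            # one decayed training step
```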
Abstract of: US20260038120A1
A method for analyzing an image, called the "analysis image", of a dental arch of a patient. In the method, the analysis image is submitted to a neural network in order to determine at least one value of an image attribute relating to the analysis image. The analysis image is a photograph or an image taken from a film. The image attribute relates to a position, an orientation, or a calibration of the acquisition apparatus used to acquire the analysis image, or a combination thereof, or to a quality of the analysis image, in particular its brightness, contrast, or sharpness, or a combination thereof.
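Illustrative sketch (not from the patent): a small PyTorch network with two hypothetical heads, one for acquisition-apparatus orientation and one for quality scores (brightness, contrast, sharpness), as one possible shape for the attribute prediction described above.

```python
import torch
import torch.nn as nn

class AttributeNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.backbone = nn.Sequential(nn.Conv2d(3, 16, 3, stride=2), nn.ReLU(),
                                      nn.AdaptiveAvgPool2d(1), nn.Flatten())
        self.orientation = nn.Linear(16, 3)  # e.g. roll/pitch/yaw of the acquisition apparatus (hypothetical)
        self.quality = nn.Linear(16, 3)      # e.g. brightness, contrast, sharpness scores (hypothetical)

    def forward(self, img):
        h = self.backbone(img)
        return self.orientation(h), self.quality(h)

orient, quality = AttributeNet()(torch.randn(1, 3, 224, 224))  # one analysis image
```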
Abstract of: US20260037802A1
Systems and methods for training a machine learning model implemented over a network configured to represent the machine learning model are provided. Directed edges connect the nodes of the network, with each edge representing a connection between a first node and a second node; the second node computes an activation that depends on the activations of the first nodes and on values associated with the connections, each connection being either conforming or non-conforming. The machine learning model may be trained by iteratively adjusting parameters w and b, respectively associated with the weights and biases of edges connecting computational nodes. Connections between nodes may be sparsified by adjusting the parameter w to a first value for non-conforming connections during the training phase, either to reduce the complexity of the connections among the plurality of nodes or to ensure the input-output function of the network adheres to additional constraints.
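Illustrative sketch (not from the patent): a toy PyTorch loop in which weights w on non-conforming edges are held at a fixed value (zero here, one possible "first value") after each update, so only conforming connections carry signal. The conformity mask and loss are hypothetical stand-ins.

```python
import torch
import torch.nn as nn

layer = nn.Linear(8, 8)                          # weights w and biases b of one layer
conforming = (torch.rand(8, 8) > 0.5).float()    # 1.0 where a connection is conforming (hypothetical)

for step in range(100):
    x = torch.randn(4, 8)
    loss = layer(x).pow(2).mean()                # stand-in training loss
    loss.backward()
    with torch.no_grad():
        layer.weight -= 1e-2 * layer.weight.grad # adjust w
        layer.bias   -= 1e-2 * layer.bias.grad   # adjust b
        layer.weight *= conforming               # pin w to 0 on non-conforming edges
    layer.weight.grad = None
    layer.bias.grad = None
```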
Abstract of: US20260038489A1
Two-pass automatic speech recognition (ASR) models can be used to perform streaming on-device ASR to generate a text representation of an utterance captured in audio data. Various implementations include a first-pass portion of the ASR model used to generate streaming candidate recognition(s) of an utterance captured in audio data. For example, the first-pass portion can include a recurrent neural network transducer (RNN-T) decoder. Various implementations include a second-pass portion of the ASR model used to revise the streaming candidate recognition(s) of the utterance and generate a text representation of the utterance. For example, the second-pass portion can include a listen attend spell (LAS) decoder. Various implementations include a shared encoder shared between the RNN-T decoder and the LAS decoder.
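Structural sketch (not from the patent): the module layout below reduces each component to a PyTorch stub to show the data flow, with one shared encoder feeding both the streaming first-pass decoder and the attention-based second-pass decoder that revises the candidates. Real RNN-T and LAS decoders are far more involved; every shape here is hypothetical.

```python
import torch
import torch.nn as nn

class SharedEncoder(nn.Module):
    def __init__(self, d=80, h=256):
        super().__init__(); self.rnn = nn.LSTM(d, h, batch_first=True)
    def forward(self, audio_feats):
        return self.rnn(audio_feats)[0]              # one encoding shared by both passes

class FirstPassRNNT(nn.Module):                      # streaming candidate recognitions (stub)
    def __init__(self, h=256, vocab=64):
        super().__init__(); self.out = nn.Linear(h, vocab)
    def forward(self, enc):
        return self.out(enc)                         # per-frame token logits

class SecondPassLAS(nn.Module):                      # attends over the full encoding (stub)
    def __init__(self, h=256, vocab=64):
        super().__init__()
        self.attn = nn.MultiheadAttention(h, 4, batch_first=True)
        self.out = nn.Linear(h, vocab)
    def forward(self, enc, candidate_emb):
        ctx, _ = self.attn(candidate_emb, enc, enc)
        return self.out(ctx)                         # revised token logits

enc = SharedEncoder()(torch.randn(1, 50, 80))        # 50 frames of audio features
streaming_logits = FirstPassRNNT()(enc)              # first pass, frame-synchronous
revised_logits = SecondPassLAS()(enc, torch.randn(1, 10, 256))  # second pass over a candidate
```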
Abstract of: US20260038529A1
A method, apparatus, and non-transitory computer-readable medium for automatic speech recognition using conditional factorization for bilingual code-switched and monolingual speech may include receiving an audio observation sequence comprising a plurality of frames, the audio observation sequence including audio in a first language or a second language. The approach may further include mapping the audio observation sequence into a first sequence of hidden representations, the mapping being generated by a first encoder corresponding to the first language and mapping the audio observation sequence into a second sequence of hidden representations, the mapping being generated by a second encoder corresponding to the second language. The approach may further include generating a label-to-frame sequence based on the first sequence of hidden representations and the second sequence of hidden representations, using a joint neural network based model.
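Structural sketch (not from the patent): the conditional factorization above, rendered as two language-specific PyTorch encoders over the same audio and a joint network producing label-to-frame scores from both hidden sequences. Shapes and vocabulary size are hypothetical.

```python
import torch
import torch.nn as nn

class LangEncoder(nn.Module):
    def __init__(self, d=80, h=128):
        super().__init__(); self.rnn = nn.GRU(d, h, batch_first=True)
    def forward(self, x):
        return self.rnn(x)[0]                        # sequence of hidden representations

enc_l1, enc_l2 = LangEncoder(), LangEncoder()        # first- and second-language encoders
joint = nn.Sequential(nn.Linear(256, 128), nn.Tanh(), nn.Linear(128, 64))

audio = torch.randn(1, 50, 80)                       # frames of one (possibly code-switched) utterance
h1, h2 = enc_l1(audio), enc_l2(audio)                # two language-conditioned hidden sequences
label_to_frame = joint(torch.cat([h1, h2], dim=-1))  # joint label-to-frame scores per frame
```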
Abstract of: US20260030910A1
In a computer-implemented workflow, a submission of an asset localized for a first location is received. The asset may be intended for dissemination to a second location. A trained neural network is applied to the asset to determine a probability of recommending localization of the asset for the second location. This determination can be based on a plurality of features indicating contextual aspects of a document, which are identified in accordance with a plurality of transformations performed on the asset utilizing the trained neural network. Responsive to determining that the probability satisfies a condition, such as being a percentage above a threshold value, a recommendation is provided to exclude the asset from being localized to the second location.
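Illustrative sketch (not from the patent): the final decision rule reduces to a probability-threshold check; the function and threshold below are hypothetical.

```python
def recommend_exclusion(prob: float, threshold: float = 0.8) -> bool:
    """True when the asset should be excluded from localization for the second location."""
    return prob > threshold

# e.g. the trained network scored the asset 0.93 for the second location
if recommend_exclusion(0.93):
    print("Recommend excluding asset from localization to the second location")
```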
Abstract of: US20260030500A1
A system for processing ultrasound images utilizes a trained orientation neural network to provide orientation information for a multiplicity of images captured around a body part, orienting each image with respect to a canonical view. In one aspect, the system includes a set creator and a generative neural network. The set creator generates sets of images and their associated transformations over time. The generative neural network then produces a summary canonical view set from these sets, showing changes during a body part cycle. In another aspect, the system includes a volume reconstructer. The volume reconstructer uses the orientation information to generate a volume representation of the body part from the oriented images using tomographic reconstruction, and to generate a canonical image from that volume representation.
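Structural sketch (not from the patent): an orientation network mapping each ultrasound image to a transformation relative to a canonical view, feeding a stubbed volume-reconstruction step. The parameterization (3 rotation + 3 translation values) and the reconstruction placeholder are hypothetical.

```python
import torch
import torch.nn as nn

class OrientationNet(nn.Module):
    def __init__(self):
        super().__init__()
        self.net = nn.Sequential(nn.Conv2d(1, 8, 3, stride=2), nn.ReLU(),
                                 nn.AdaptiveAvgPool2d(1), nn.Flatten(),
                                 nn.Linear(8, 6))     # e.g. 3 rotation + 3 translation params (hypothetical)
    def forward(self, img):
        return self.net(img)

images = torch.randn(32, 1, 64, 64)                   # images captured around the body part
transforms = OrientationNet()(images)                 # per-image orientation information

def reconstruct_volume(images, transforms):
    # placeholder for tomographic reconstruction from the oriented images
    return torch.zeros(64, 64, 64)

volume = reconstruct_volume(images, transforms)       # volume representation of the body part
```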
Abstract of: US2025259062A1
A method for training a shape optimization neural network to produce an optimized point cloud defining desired shapes of materials with given properties is provided. The method comprises collecting a subject point cloud including points identified by their initial coordinates and material properties and jointly training a first neural network to iteratively modify a shape boundary by changing coordinates of a set of points in the subject point cloud to maximize an objective function and a second neural network to solve for physical fields by satisfying partial differential equations imposed by physics of the different materials of the subject point cloud having a shape produced by the changed coordinates output by the first neural network. The method also comprises outputting optimized coordinates of the set of points in the subject point cloud, produced by the trained first neural network.
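Illustrative sketch (not from the patent): a toy PyTorch joint-training loop in which a first network proposes coordinate updates for boundary points and a second network predicts physical fields on the moved points; the objective and PDE residual below are stand-ins (a real physics term would use autograd derivatives of the field).

```python
import torch
import torch.nn as nn

f1 = nn.Sequential(nn.Linear(3, 32), nn.Tanh(), nn.Linear(32, 3))  # coordinate updates (first network)
f2 = nn.Sequential(nn.Linear(3, 32), nn.Tanh(), nn.Linear(32, 1))  # physical field (second network)
opt = torch.optim.Adam(list(f1.parameters()) + list(f2.parameters()), lr=1e-3)

points = torch.randn(256, 3)                     # subject point cloud (xyz coordinates)
for step in range(100):
    moved = points + f1(points)                  # iteratively modified shape boundary
    field = f2(moved)                            # field values on the new shape
    objective = field.mean()                     # stand-in objective to maximize
    pde_residual = field.pow(2).mean()           # stand-in PDE residual penalty
    loss = -objective + pde_residual             # one joint loss for both networks
    opt.zero_grad(); loss.backward(); opt.step()

optimized = points + f1(points)                  # optimized coordinates of the point set
```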
Abstract of: US20260024188A1
A vision analytics and validation (VAV) system for providing improved inspection of robotic assembly, the VAV system comprising a trained neural network three-way classifier to classify each component as good, bad, or do-not-know, and an operator station configured to enable an operator to review an output of the trained neural network and to determine whether a board including one or more components classified as "bad" or "do not know" passes review and is classified as good, or fails review and is classified as bad. In one embodiment, a retraining trigger utilizes the output of the operator station to retrain the trained neural network, based on the determination received from the operator station.
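Illustrative sketch (not from the patent): the three-way routing logic in miniature, with a hypothetical linear classifier over per-component features; any board with a non-"good" component goes to the operator station.

```python
import torch
import torch.nn as nn

classifier = nn.Linear(128, 3)                   # logits for [good, bad, do_not_know] (hypothetical)
LABELS = ["good", "bad", "do_not_know"]

def classify(component_feats: torch.Tensor) -> str:
    return LABELS[classifier(component_feats).argmax().item()]

board = [torch.randn(128) for _ in range(5)]     # features of each component on one board
verdicts = [classify(c) for c in board]
if any(v != "good" for v in verdicts):
    print("route board to operator station for review")  # operator decides pass/fail
else:
    print("board classified as good")
```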
Abstract of: US20260024249A1
An apparatus to facilitate augmenting temporal anti-aliasing with a neural network for history validation is disclosed. The apparatus includes a set of processing resources configured to perform augmented temporal anti-aliasing (TAA), the set of processing resources including circuitry configured to: receive, at a history validation neural network, inputs for a current pixel of a current frame and a reprojected pixel corresponding to the current pixel, the reprojected pixel originating from history data of the current frame; generate, using an output of the history validation neural network, a validated color for the current pixel based on current color data corresponding to the current pixel and history color data corresponding to the reprojected pixel; render an output frame using the validated color; and add the output frame to the history data.
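Illustrative sketch (not from the patent): one way to realize the validation step is a small network scoring how trustworthy the reprojected history pixel is, with the validated color blending history and current colors by that score; the network, inputs, and blend are hypothetical.

```python
import torch
import torch.nn as nn

validator = nn.Sequential(nn.Linear(6, 16), nn.ReLU(),
                          nn.Linear(16, 1), nn.Sigmoid())    # history validity score in [0, 1]

current_rgb = torch.rand(3)    # color data for the current pixel
history_rgb = torch.rand(3)    # color data for the reprojected pixel from the history
alpha = validator(torch.cat([current_rgb, history_rgb]))
validated = alpha * history_rgb + (1 - alpha) * current_rgb  # validated color for the output frame
```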
Abstract of: US20260023971A1
One or more systems, devices, computer program products and/or computer-implemented methods of use provided herein relate to predicting an optimized result for a graph neural network (GNN). A system can comprise a memory configured to store computer executable components and a processor configured to execute the computer executable components stored in the memory, wherein the computer executable components comprise: a fusion component that models non-linear modality correlations within and across entities through Hirschfeld-Gebelein-Rényi maximal correlation (MaxCorr) embeddings, generating a multi-graph that preserves the identities of modalities and entities; and a multi-graph neural network (MGNN) component for task-informed reasoning in multi-graphs that learns parameters defining entity-modality graph connectivity and message passing in an end-to-end fashion.
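Illustrative sketch (not from the patent): a toy neural HGR maximal-correlation objective between two modalities, one ingredient behind MaxCorr-style embeddings. Networks f and g embed each modality and training maximizes the correlation of their centered outputs under a soft whitening penalty; dimensions and the penalty form are hypothetical.

```python
import torch
import torch.nn as nn

f = nn.Linear(32, 8)                               # embeds modality 1
g = nn.Linear(16, 8)                               # embeds modality 2
opt = torch.optim.Adam(list(f.parameters()) + list(g.parameters()), lr=1e-3)

x, y = torch.randn(512, 32), torch.randn(512, 16)  # two modalities of the same entities
for step in range(100):
    fx, gy = f(x), g(y)
    fx, gy = fx - fx.mean(0), gy - gy.mean(0)      # center each embedding
    corr = (fx * gy).mean()                        # cross-correlation term to maximize
    penalty = (fx.pow(2).mean() - 1).abs() + (gy.pow(2).mean() - 1).abs()  # soft whitening
    loss = -corr + penalty
    opt.zero_grad(); loss.backward(); opt.step()
```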
Abstract of: US20260023950A1
Techniques are disclosed that enable generating jointly probable output by processing input using a multi-stream recurrent neural network transducer (MS RNN-T) model. Various implementations include generating a first output sequence and a second output sequence by processing a single input sequence using the MS RNN-T, where the first output sequence is jointly probable with the second output sequence. Additional or alternative techniques are disclosed that enable generating output by processing multiple input sequences using the MS RNN-T. Various implementations include processing a first input sequence and a second input sequence using the MS RNN-T to generate output. In some implementations, the MS RNN-T can be used to process two or more input sequences to generate two or more jointly probable output sequences.
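Structural sketch (not from the patent): the multi-stream idea reduced to stubs, with one input sequence encoded once and two output heads producing two streams whose scores share the same encoder state; real MS RNN-T joint decoding is much richer, and everything below is hypothetical.

```python
import torch
import torch.nn as nn

encoder = nn.LSTM(40, 128, batch_first=True)
head_a = nn.Linear(128, 64)                   # token logits for the first output stream
head_b = nn.Linear(128, 64)                   # token logits for the second output stream

enc, _ = encoder(torch.randn(1, 50, 40))      # single input sequence, shared encoding
logits_a, logits_b = head_a(enc), head_b(enc)
# a crude joint score for one pair of greedy streams, coupled through the shared encoding
joint_logprob = (logits_a.log_softmax(-1).max(-1).values
                 + logits_b.log_softmax(-1).max(-1).values).sum()
```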
Publication No.: KR20260009627A, 20/01/2026
Applicant: 고려대학교산학협력단 (Korea University Industry-Academic Cooperation Foundation)
Abstract of: US20260018162A1
A method for training an artificial neural network based on speech imagination biosignals and phoneme information according to one embodiment of the present disclosure may comprise the steps of: collecting speech imagination biosignals; labeling the collected speech imagination biosignals with phoneme information; pre-processing the labeled speech imagination biosignals; extracting feature vectors from the pre-processed speech imagination biosignals; and training an artificial neural network on the extracted feature vectors to generate a classification model, wherein the pre-processing includes windowing to cut the labeled speech imagination biosignals into phoneme units, and the training includes labeling the feature vectors extracted in phoneme units with phoneme information.
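Illustrative sketch (not from the patent): the pipeline above in miniature, windowing a multi-channel biosignal into phoneme units, extracting a toy feature vector per window, and taking one training step of a phoneme classifier. Channel counts, window boundaries, features, and class count are all hypothetical.

```python
import torch
import torch.nn as nn

signal = torch.randn(8, 10_000)                      # 8-channel speech imagination biosignal
boundaries = [(0, 500), (500, 900), (900, 1600)]     # phoneme-unit windows (hypothetical)
phonemes = torch.tensor([3, 7, 1])                   # phoneme label per window

def extract_features(window: torch.Tensor) -> torch.Tensor:
    # toy feature vector: per-channel mean and standard deviation
    return torch.cat([window.mean(dim=1), window.std(dim=1)])

feats = torch.stack([extract_features(signal[:, a:b]) for a, b in boundaries])
clf = nn.Sequential(nn.Linear(16, 32), nn.ReLU(), nn.Linear(32, 40))  # 40 phoneme classes
loss = nn.functional.cross_entropy(clf(feats), phonemes)
loss.backward()                                      # one training step of the classification model
```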