news 2026/4/18 5:28:00

Vector database gets a high-performance deployment option for more demanding workloads

张小明

Front-end developer


Vector database startup Pinecone Systems Inc. today announced a new, high-performance deployment option for customers that need to support the most demanding enterprise use cases.

It’s called Dedicated Read Nodes or DRN, and it’s now available in public preview, giving customers access to reserved capacity for low-latency queries with predictable performance and cost. The company explained that DRNs allow it to support a wider range of use cases that have extreme but variable performance requirements.

Pinecone is the creator of an advanced vector database that can dynamically store, transform and index billions of high-dimensional data points, enabling it to respond rapidly and accurately to queries such as nearest-neighbor search.

Unlike relational databases, which store data in rows and columns, vector databases represent unstructured data as high-dimensional data points, each representing a vector or an array of numbers. One of the primary functions of a vector database is to perform similarity searches, which can quickly find vectors that are most similar to a given query vector using measures such as cosine similarity or Euclidean distance. Vector databases are seen as essential for artificial intelligence workloads, as large language models need rapid access to vast amounts of unstructured data.
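As a rough illustration of the two measures named above, the following sketch scores a query vector against a handful of stored vectors. It is a brute-force scan over toy data, not how Pinecone or any production vector database indexes its vectors; the document IDs, the vectors and the `top_k` helper are hypothetical.

```python
import math


def cosine_similarity(a, b):
    """Cosine of the angle between two equal-length vectors (1.0 = same direction)."""
    dot = sum(x * y for x, y in zip(a, b))
    norm_a = math.sqrt(sum(x * x for x in a))
    norm_b = math.sqrt(sum(y * y for y in b))
    return dot / (norm_a * norm_b)


def euclidean_distance(a, b):
    """Straight-line distance between two equal-length vectors (0.0 = identical)."""
    return math.sqrt(sum((x - y) ** 2 for x, y in zip(a, b)))


def top_k(query, vectors, k=2):
    """Return the IDs of the k stored vectors most similar to the query (by cosine)."""
    scored = [(cosine_similarity(query, vec), key) for key, vec in vectors.items()]
    scored.sort(reverse=True)  # highest similarity first
    return [key for _, key in scored[:k]]


# Hypothetical toy data: three 3-dimensional "embeddings".
vectors = {
    "doc-a": [1.0, 0.0, 0.0],
    "doc-b": [0.9, 0.1, 0.0],
    "doc-c": [0.0, 1.0, 0.0],
}
query = [1.0, 0.05, 0.0]

print(top_k(query, vectors))
```

Real systems work with hundreds or thousands of dimensions and typically replace the full scan with an approximate nearest-neighbor index, but the scoring step is the same idea.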

In a blog post, Pinecone explained that AI systems have complex requirements. Some applications, such as RAG, AI agents, model prototypes and scheduled jobs, have “bursty” workloads: they maintain a low and steady flow of traffic most of the time, then burst into life when query volume spikes. In such cases, Pinecone’s standard on-demand database is ideal, providing a combination of simplicity, elasticity and usage-based pricing.

However, some applications require consistent high throughput, operate at larger scales and can be extremely sensitive to latency. For instance, billion-vector-scale semantic searches, real-time recommendation systems and user-facing assistants with tight service-level objectives demand a more consistent level of performance, along with predictable costs at scale.

Better performance without limits

This is why Pinecone is introducing DRNs, a new deployment option where queries run on isolated, provisioned nodes that are dedicated to these kinds of workloads. With these nodes, the data stays “warm” in the system’s memory and on a local solid-state drive.

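That means the data can be accessed quickly, without any “cold starts,” which occur when information first has to be fetched from object storage. And because the nodes are dedicated to each workload, there are no “noisy neighbors,” shared queues or query limits.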

DRNs scale along two dimensions, with replicas ensuring maximum throughput and availability to improve resilience, and shards used to expand storage capacity. Users can add as many replicas and shards as they desire to ensure their workloads can scale. To ensure predictable costs, pricing is based on an hourly rate per node.
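To make the pricing model concrete, here is a minimal sketch of how an hourly per-node rate translates into a bill and into a break-even point against per-request pricing. It assumes the node count is simply shards multiplied by replicas, which the announcement does not state explicitly, and every price in it is a made-up placeholder rather than a Pinecone rate.

```python
def node_count(shards, replicas):
    """Total nodes, ASSUMING every shard is copied once per replica."""
    return shards * replicas


def monthly_cost(shards, replicas, hourly_rate_per_node, hours_per_month=730):
    """Fixed monthly cost under hourly per-node pricing."""
    return node_count(shards, replicas) * hourly_rate_per_node * hours_per_month


def break_even_qps(shards, replicas, hourly_rate_per_node, price_per_request):
    """Sustained queries per second above which the fixed hourly cost
    becomes cheaper than paying a (hypothetical) per-request price."""
    hourly_cost = node_count(shards, replicas) * hourly_rate_per_node
    requests_per_hour = hourly_cost / price_per_request
    return requests_per_hour / 3600


# Hypothetical figures, for illustration only.
SHARDS = 2
REPLICAS = 3
HOURLY_RATE = 1.0           # placeholder dollars per node per hour
PRICE_PER_REQUEST = 0.00001  # placeholder dollars per on-demand query

print(node_count(SHARDS, REPLICAS))
print(monthly_cost(SHARDS, REPLICAS, HOURLY_RATE))
print(break_even_qps(SHARDS, REPLICAS, HOURLY_RATE, PRICE_PER_REQUEST))
```

The point of the sketch is the shape of the trade-off: the dedicated cost is flat regardless of traffic, so it only pays off once sustained query volume stays above the break-even rate.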

Pinecone said customers will benefit from the lowest possible latency and guaranteed high throughput to ensure more consistent performance for high query-per-second workloads. DRNs can also scale indefinitely, and the company further claims that customers will see lower, more predictable costs compared to its on-demand nodes, which are based on a per-request pricing model.

DRNs are a deployment option for the most demanding use cases, where companies require performance isolation, predictable low latency under heavy loads and linear scaling as demand grows. In addition to billion-vector-scale search and recommendation systems, DRNs can also be useful for mission-critical AI applications, large enterprise or multitenant platforms that require isolation to prevent one workload from affecting another, and other applications that need performance at scale.

Pinecone said its DRNs have proven their reliability under real-world conditions for several early adopters. One customer is using DRN to support metadata-filtered real-time media search on its design platform, and was able to sustain 600 queries per second with latency of just 45 milliseconds across 135 million vectors. The same customer also pushed it to the limit, running a load test that saw its node reach 2,200 queries per second with a P50 latency of just 60 milliseconds.

In another example, a customer running a large e-commerce marketplace deployed its recommendation engine on Pinecone’s DRNs to support 5,700 queries per second with a P50 latency of just 26 milliseconds across a database of 1.4 billion vectors.
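One way to read the throughput and latency figures quoted above is through Little's law, which estimates the average number of queries in flight as throughput multiplied by latency. The sketch below applies it to the three reported data points. This is a back-of-the-envelope estimate only: Little's law calls for mean latency, whereas two of the figures are P50 (median) values and the first is unspecified, and the helper name is hypothetical.

```python
def in_flight_queries(qps, latency_ms):
    """Little's law: average concurrency = arrival rate x time in system."""
    return qps * latency_ms / 1000


# (label, queries per second, latency in milliseconds) as reported in the article.
reported = [
    ("media search, sustained", 600, 45),
    ("media search, load test", 2200, 60),
    ("e-commerce recommendations", 5700, 26),
]

for label, qps, latency_ms in reported:
    print(f"{label}: ~{in_flight_queries(qps, latency_ms):.1f} queries in flight")
```

Read this way, the e-commerce deployment handles more than five times the concurrency of the sustained media-search workload while reporting lower latency, which is consistent with it running on a larger set of nodes.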